peach.tl.pathway_associations

peach.tl.pathway_associations(adata, *, pathway_obsm_key='pathway_scores', obsm_key='archetype_distances', obs_key='archetypes', test_method='mannwhitneyu', fdr_method='benjamini_hochberg', fdr_scope='global', test_direction='two-sided', min_logfc=0.01, min_cells=10, comparison_group='all', verbose=True, **kwargs)

Test pathway activity associations with archetypal assignments.

By default, performs a Mann-Whitney U test for each pathway and archetype to identify pathways whose activity differs significantly between that archetype's cells and all other cells.
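The one-vs-rest comparison described above can be sketched with SciPy alone. This is a minimal illustration under stated assumptions, not the function's actual implementation: the synthetic `scores` and `labels` arrays and the helper `one_vs_rest` are hypothetical stand-ins for the pathway score matrix and archetype assignments.

```python
import numpy as np
from scipy.stats import mannwhitneyu

rng = np.random.default_rng(0)

# Hypothetical data: 60 cells, 2 pathways, 3 archetype labels.
scores = rng.normal(size=(60, 2))
labels = np.repeat(["A0", "A1", "A2"], 20)
scores[labels == "A0", 0] += 2.0  # make pathway 0 more active in archetype A0


def one_vs_rest(scores, labels, alternative="two-sided"):
    """Test each pathway in each archetype against all remaining cells."""
    rows = []
    for arch in np.unique(labels):
        mask = labels == arch
        for j in range(scores.shape[1]):
            in_arch, rest = scores[mask, j], scores[~mask, j]
            stat, p = mannwhitneyu(in_arch, rest, alternative=alternative)
            mean_diff = float(in_arch.mean() - rest.mean())
            rows.append((str(arch), j, mean_diff, float(stat), float(p)))
    return rows


rows = one_vs_rest(scores, labels)
for arch, pathway_idx, mean_diff, stat, p in rows:
    print(f"{arch} pathway {pathway_idx}: mean_diff={mean_diff:.3f}, p={p:.3g}")
```

Each tuple holds the archetype, pathway index, mean difference, U statistic, and raw p-value, which loosely mirrors the columns documented under Returns.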

Parameters:

adata (AnnData) –
Annotated data object with:
- obsm[pathway_obsm_key] : Pathway scores [n_cells, n_pathways]
- obsm[obsm_key] : Archetype distance matrix
- obs[obs_key] : Archetype assignments
- uns[pathway_obsm_key + '_pathways'] : Pathway names (optional)
pathway_obsm_key (str, default: "pathway_scores") – Key in adata.obsm containing pathway activity scores.
obsm_key (str, default: "archetype_distances") – Key in adata.obsm containing archetype distance matrix.
obs_key (str, default: "archetypes") – Column in adata.obs containing archetypal assignments.
test_method (str, default: "mannwhitneyu") – Statistical test method.
fdr_method (str, default: "benjamini_hochberg") – FDR correction method.
fdr_scope ({'global', 'per_archetype', 'none'}, default: 'global') – Scope of FDR correction.
test_direction (str, default: "two-sided") – Direction of statistical test.
min_logfc (float, default: 0.01) – Minimum effect size threshold. For pathway scores, the effect size is mean_diff.
min_cells (int, default: 10) – Minimum cells required per archetype.
comparison_group (str, default: 'all') – Comparison group: 'all' or 'archetypes_only'.
verbose (bool, default: True) – Whether to print progress.
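To illustrate how the choice of fdr_scope can change results, here is a rough sketch of Benjamini-Hochberg correction applied at two scopes. It assumes that 'global' pools the p-values from all archetypes into one correction and that 'per_archetype' corrects within each archetype separately. That reading of the option names is an assumption, and the helper `benjamini_hochberg` and the toy p-values are hypothetical, not taken from the library.

```python
import numpy as np


def benjamini_hochberg(pvals):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    p = np.asarray(pvals, dtype=float)
    n = p.size
    order = np.argsort(p)
    ranked = p[order] * n / np.arange(1, n + 1)
    # Enforce monotonicity, working down from the largest p-value.
    ranked = np.minimum.accumulate(ranked[::-1])[::-1]
    adjusted = np.empty(n)
    adjusted[order] = np.clip(ranked, 0.0, 1.0)
    return adjusted


# Hypothetical raw p-values: two pathways tested in each of two archetypes.
pvals = {"A0": [0.001, 0.04], "A1": [0.03, 0.5]}

# Assumed 'per_archetype' scope: correct within each archetype.
per_archetype = {arch: benjamini_hochberg(p) for arch, p in pvals.items()}

# Assumed 'global' scope: correct all tests together, then split back.
flat_adj = benjamini_hochberg(np.concatenate([pvals[a] for a in pvals]))
global_adj, start = {}, 0
for arch, p in pvals.items():
    global_adj[arch] = flat_adj[start:start + len(p)]
    start += len(p)

print(per_archetype["A0"])  # second value stays at 0.04
print(global_adj["A0"])     # second value rises to about 0.053
```

In this toy case the raw p-value of 0.04 would pass a 0.05 cutoff under the per-archetype correction but not under the global one, which is why the scope is worth choosing deliberately.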

Returns:

Results with columns:

pathway : str - Pathway name
archetype : str - Archetype identifier
n_archetype_cells : int - Cells in archetype
n_other_cells : int - Cells in comparison
mean_archetype : float - Mean score in archetype
mean_other : float - Mean score in others
mean_diff : float - Mean difference (primary effect size)
log_fold_change : float - Alias for mean_diff
statistic : float - Test statistic
pvalue : float - Raw p-value
fdr_pvalue : float - FDR-corrected p-value
significant : bool - Whether the association was called significant
direction : str - ‘higher’ or ‘lower’

Return type:

pd.DataFrame

Notes

Pathway scores (from AUCell, pySCENIC, etc.) represent activity levels, not expression counts. Mean difference is more interpretable than log fold change for these scores.
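A small numeric sketch of why mean difference can be easier to read for such scores. The values below are invented for illustration and assume small, AUCell-like activity scores close to zero.

```python
import numpy as np

# Hypothetical activity scores for one pathway (values are illustrative only).
arch = np.array([0.02, 0.03, 0.025])     # cells in the archetype
other = np.array([0.005, 0.004, 0.006])  # comparison cells

mean_diff = arch.mean() - other.mean()         # absolute shift in activity
log2_fc = np.log2(arch.mean() / other.mean())  # ratio-based effect size

print(f"mean_diff = {mean_diff:.3f}")
print(f"log2 fold change = {log2_fc:.2f}")
```

Here the absolute shift is only 0.02 score units, while the ratio-based measure reports a more than fourfold change because the denominator is close to zero. A ratio is also undefined when the comparison mean is zero or negative, which can happen with centered or scaled scores.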

Examples

>>> import peach as pc
>>> # Basic usage
>>> results = pc.tl.pathway_associations(adata)
>>> # Filter for specific pathway categories
>>> metabolism = results[results["pathway"].str.contains("METABOLISM", case=False)]
>>> # Top pathways per archetype
>>> for arch in results["archetype"].unique():
...     top = results[(results["archetype"] == arch) & (results["significant"])].nlargest(5, "mean_diff")
...     print(f"{arch}: {top['pathway'].tolist()}")
