peach.tl.pathway_associations

peach.tl.pathway_associations(adata, *, pathway_obsm_key='pathway_scores', obsm_key='archetype_distances', obs_key='archetypes', test_method='mannwhitneyu', fdr_method='benjamini_hochberg', fdr_scope='global', test_direction='two-sided', min_logfc=0.01, min_cells=10, comparison_group='all', verbose=True, **kwargs)

Test pathway activity associations with archetypal assignments.

By default, performs Mann-Whitney U tests to identify pathways whose activity differs significantly between each archetype and all other cells.

Parameters:
  • adata (AnnData) –

    Annotated data object with:

    • obsm[pathway_obsm_key] : Pathway scores [n_cells, n_pathways]

    • obsm[obsm_key] : Archetype distance matrix

    • obs[obs_key] : Archetype assignments

    • uns[pathway_obsm_key + '_pathways'] : Pathway names (optional)

  • pathway_obsm_key (str, default: "pathway_scores") – Key in adata.obsm containing pathway activity scores.

  • obsm_key (str, default: "archetype_distances") – Key in adata.obsm containing archetype distance matrix.

  • obs_key (str, default: "archetypes") – Column in adata.obs containing archetypal assignments.

  • test_method (str, default: "mannwhitneyu") – Statistical test method.

  • fdr_method (str, default: "benjamini_hochberg") – FDR correction method.

  • fdr_scope ({'global', 'per_archetype', 'none'}, default: 'global') – Scope of FDR correction.

  • test_direction (str, default: "two-sided") – Direction of statistical test.

  • min_logfc (float, default: 0.01) – Minimum effect size threshold (mean_diff for pathways).

  • min_cells (int, default: 10) – Minimum cells required per archetype.

  • comparison_group (str, default: 'all') – Comparison group: 'all' or 'archetypes_only'.

  • verbose (bool, default: True) – Whether to print progress.
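The difference between the 'global' and 'per_archetype' settings of fdr_scope can be illustrated with a small standalone sketch. It is not peach's internal code: the `bh_adjust` helper, the archetype labels, and the p-values are all made up for illustration. It also assumes that 'global' means one Benjamini-Hochberg correction across every archetype and pathway test, and that 'per_archetype' means a separate correction within each archetype's tests.

```python
import numpy as np

def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values for a 1-D array (illustrative helper)."""
    p = np.asarray(pvalues, dtype=float)
    n = p.size
    order = np.argsort(p)
    # Scale each sorted p-value by n / rank.
    ranked = p[order] * n / np.arange(1, n + 1)
    # Enforce monotonicity, working down from the largest p-value.
    ranked = np.minimum.accumulate(ranked[::-1])[::-1]
    adjusted = np.empty(n)
    adjusted[order] = np.clip(ranked, 0.0, 1.0)
    return adjusted

# Toy raw p-values for 2 archetypes x 3 pathways (made-up numbers).
archetypes = np.array(["A0", "A0", "A0", "A1", "A1", "A1"])
pvalues = np.array([0.001, 0.020, 0.300, 0.004, 0.040, 0.900])

# Assumed 'global' behaviour: one correction over all six tests.
fdr_global = bh_adjust(pvalues)

# Assumed 'per_archetype' behaviour: correct each archetype's tests separately.
fdr_per_archetype = np.empty_like(pvalues)
for arch in np.unique(archetypes):
    mask = archetypes == arch
    fdr_per_archetype[mask] = bh_adjust(pvalues[mask])

print(np.round(fdr_global, 3))
print(np.round(fdr_per_archetype, 3))
```

In this toy case the per-archetype correction is never stricter than the global one, because each archetype's family contains fewer tests.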

Returns:

Results with columns:

  • pathway : str - Pathway name

  • archetype : str - Archetype identifier

  • n_archetype_cells : int - Cells in archetype

  • n_other_cells : int - Cells in comparison

  • mean_archetype : float - Mean score in archetype

  • mean_other : float - Mean score in others

  • mean_diff : float - Mean difference (primary effect size)

  • log_fold_change : float - Alias for mean_diff

  • statistic : float - Test statistic

  • pvalue : float - Raw p-value

  • fdr_pvalue : float - FDR-corrected p-value

  • significant : bool - Whether significant

  • direction : str - 'higher' or 'lower'

Return type:

pd.DataFrame
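Because the result is a long-format table with one row per pathway and archetype, it can be reshaped with ordinary pandas operations. The sketch below uses a hand-built stand-in containing a subset of the documented columns. The values and the labels such as "archetype_0" are hypothetical, not output from peach.

```python
import pandas as pd

# Toy stand-in for the returned DataFrame (made-up values, subset of columns).
results = pd.DataFrame({
    "pathway": ["GLYCOLYSIS", "GLYCOLYSIS", "HYPOXIA", "HYPOXIA"],
    "archetype": ["archetype_0", "archetype_1", "archetype_0", "archetype_1"],
    "mean_diff": [0.12, -0.08, 0.02, 0.15],
    "fdr_pvalue": [0.001, 0.010, 0.400, 0.003],
    "significant": [True, True, False, True],
    "direction": ["higher", "lower", "higher", "higher"],
})

# Keep significant rows only, then lay out effect sizes as pathway x archetype.
sig = results[results["significant"]]
effect_matrix = sig.pivot(index="pathway", columns="archetype", values="mean_diff")

# Count significant pathways per archetype, split by direction.
counts = sig.groupby(["archetype", "direction"]).size().unstack(fill_value=0)

print(effect_matrix)
print(counts)
```

Pairs that did not reach significance appear as NaN in `effect_matrix`, which makes it easy to pass to a heatmap that masks missing cells.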

Notes

Pathway scores (from AUCell, pySCENIC, etc.) represent activity levels, not expression counts. Mean difference is more interpretable than log fold change for these scores.
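As a rough illustration of this point, the sketch below runs a single two-sided Mann-Whitney U test on synthetic, bounded activity scores and reports the mean difference. It then shows how a log ratio can hide the absolute size of a shift. The data are randomly generated and the variable names are illustrative. This is a standalone example built on `scipy.stats.mannwhitneyu`, not the function's internal implementation.

```python
import numpy as np
from scipy.stats import mannwhitneyu

rng = np.random.default_rng(0)

# Synthetic activity scores for one pathway (made-up data, bounded in [0, 1]).
scores_archetype = rng.beta(2, 30, size=200)  # cells assigned to the archetype
scores_other = rng.beta(2, 60, size=800)      # comparison cells

# Two-sided Mann-Whitney U test, archetype versus comparison cells.
statistic, pvalue = mannwhitneyu(
    scores_archetype, scores_other, alternative="two-sided"
)

# Effect size as a plain difference of means, in the same units as the scores.
mean_diff = scores_archetype.mean() - scores_other.mean()
direction = "higher" if mean_diff > 0 else "lower"
print(f"U={statistic:.0f}  p={pvalue:.2e}  mean_diff={mean_diff:.3f}  ({direction})")

# The same log2 ratio can correspond to very different absolute shifts.
for a, b in [(0.02, 0.01), (0.60, 0.30)]:
    print(f"mean_diff={a - b:.2f}  log2_ratio={np.log2(a / b):.2f}")
```

Both pairs in the final loop have a log2 ratio of 1, yet the first is a shift of 0.01 in activity and the second a shift of 0.30. The mean difference keeps that distinction visible.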

Examples

>>> import peach as pc
>>> # Basic usage
>>> results = pc.tl.pathway_associations(adata)
>>> # Filter for specific pathway categories
>>> metabolism = results[results["pathway"].str.contains("METABOLISM", case=False)]
>>> # Top pathways per archetype
>>> for arch in results["archetype"].unique():
...     top = results[(results["archetype"] == arch) & (results["significant"])].nlargest(5, "mean_diff")
...     print(f"{arch}: {top['pathway'].tolist()}")

See also

peach.tl.gene_associations

Gene-level testing

peach._core.types.PathwayAssociationResult

Result row structure