peach.tl.compute_conditional_centroids

peach.tl.compute_conditional_centroids(adata, condition_column, *, pca_key='X_pca', store_key='conditional_centroids', exclude_archetypes=None, groupby=None, verbose=True)

Compute centroid positions in PCA space for each level of a categorical condition.

This function calculates the mean position (centroid) in PCA space of the cells belonging to each level of a categorical variable. It is useful for visualizing how different conditions (e.g., treatment phases, timepoints) relate to the archetypal structure.

Following the R template patterns:

  • Uses ALL PCs for centroid calculation (equivalent to R's colMeans).

  • Stores the full PC centroid but extracts the first 3 PCs for 3D visualization.

  • Excludes 'no_archetype' and 'archetype_0' cells by default.
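The centroid logic described above can be sketched with NumPy alone. This is an illustration under stated assumptions, not the library's implementation: the helper name `sketch_centroids` and the `archetype_labels` array (standing in for whatever per-cell archetype assignment the library reads) are hypothetical.

```python
import numpy as np

def sketch_centroids(pca, conditions, archetype_labels,
                     exclude=("no_archetype", "archetype_0")):
    """Hypothetical helper: mean PCA position per condition level."""
    pca = np.asarray(pca, dtype=float)
    conditions = np.asarray(conditions)
    archetype_labels = np.asarray(archetype_labels)

    # Drop cells whose archetype label is in the exclusion list.
    keep = ~np.isin(archetype_labels, list(exclude))

    centroids, centroids_3d, cell_counts = {}, {}, {}
    for level in np.unique(conditions[keep]):
        mask = keep & (conditions == level)
        # Column-wise mean over ALL PCs (the analogue of R's colMeans).
        center = pca[mask].mean(axis=0)
        centroids[str(level)] = center.tolist()
        # Only the first three PCs are kept for 3D plotting.
        centroids_3d[str(level)] = center[:3].tolist()
        cell_counts[str(level)] = int(mask.sum())
    return centroids, centroids_3d, cell_counts

# Toy data: 4 cells, 4 PCs, two condition levels.
pca = np.array([
    [1.0, 0.0, 2.0, 4.0],
    [3.0, 2.0, 0.0, 0.0],
    [5.0, 5.0, 5.0, 5.0],
    [9.0, 9.0, 9.0, 9.0],  # excluded below via its archetype label
])
conditions = np.array(["pre", "pre", "post", "post"])
archetypes = np.array(["archetype_1", "archetype_2", "archetype_1", "no_archetype"])

centroids, centroids_3d, cell_counts = sketch_centroids(pca, conditions, archetypes)
```

Passing `exclude=()` to the sketch keeps every cell, mirroring the documented behaviour of `exclude_archetypes=[]`.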

Parameters:
  • adata (AnnData) – Annotated data object with PCA coordinates in adata.obsm[pca_key].

  • condition_column (str) – Name of categorical column in adata.obs to group by. Examples: 'treatment_phase', 'timepoint', 'batch'.

  • pca_key (str, default: "X_pca") – Key in adata.obsm containing PCA coordinates.

  • store_key (str, default: "conditional_centroids") – Key in adata.uns to store results.

  • exclude_archetypes (list, optional) – Archetype labels to exclude from centroid calculation. Default: ['no_archetype', 'archetype_0'] (following R template). Set to empty list [] to include all cells.

  • groupby (str, optional) – Second categorical column for multi-group trajectories. If provided, centroids are computed for each (group, level) combination. Example: groupby='response_group' to get separate trajectories per response.

  • verbose (bool, default: True) – Whether to print progress messages.
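As a rough illustration of what `groupby` adds, the sketch below computes one centroid per (group, level) pair with NumPy. The helper name `sketch_group_centroids` is hypothetical and the library's internals may differ; the nested `{group: {level: coords}}` layout follows the `group_centroids` entry described under Returns.

```python
import numpy as np

def sketch_group_centroids(pca, levels, groups):
    """Hypothetical helper: {group: {level: mean PCA coordinates}}."""
    pca = np.asarray(pca, dtype=float)
    levels = np.asarray(levels)
    groups = np.asarray(groups)

    out = {}
    for group in np.unique(groups):
        in_group = groups == group
        out[str(group)] = {}
        # Only levels actually observed within this group get a centroid.
        for level in np.unique(levels[in_group]):
            mask = in_group & (levels == level)
            out[str(group)][str(level)] = pca[mask].mean(axis=0).tolist()
    return out

# Toy data: 4 cells, 2 PCs, two levels crossed with two groups.
pca = np.array([[0.0, 0.0], [2.0, 2.0], [4.0, 0.0], [6.0, 2.0]])
levels = np.array(["chemo-naive", "IDS", "chemo-naive", "IDS"])
groups = np.array(["long", "long", "short", "short"])

group_centroids = sketch_group_centroids(pca, levels, groups)
```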

Returns:

Dictionary with keys:

  • condition_column : str - name of the condition column

  • n_levels : int - number of unique levels

  • levels : List[str] - list of level names

  • centroids : Dict[str, List[float]] - level → full PCA coordinates

  • centroids_3d : Dict[str, List[float]] - level → [x, y, z] first 3 PCs

  • cell_counts : Dict[str, int] - level → cell count

  • pca_key : str - PCA key used

  • exclude_archetypes : List[str] - archetypes excluded

  • groupby : Optional[str] - groupby column if used

  • group_centroids : Optional[Dict] - if groupby: {group: {level: coords}}

  • group_centroids_3d : Optional[Dict] - if groupby: {group: {level: [x,y,z]}}

  • group_cell_counts : Optional[Dict] - if groupby: {group: {level: count}}
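One way to consume the returned dictionary is shown below on a hand-written stand-in that reuses the `centroids_3d` values from the Examples section. The `order` list and the step-length calculation are illustrative choices, not part of the API.

```python
import math

# Hand-written stand-in for the returned dictionary (only two keys shown).
result = {
    "levels": ["chemo-naive", "IDS"],
    "centroids_3d": {
        "chemo-naive": [1.2, 0.5, -0.3],
        "IDS": [0.8, 1.1, 0.2],
    },
}

# Illustrative ordering of levels along the trajectory.
order = ["chemo-naive", "IDS"]

# Euclidean distance between consecutive centroids in the first 3 PCs.
steps = {}
for start, end in zip(order, order[1:]):
    a = result["centroids_3d"][start]
    b = result["centroids_3d"][end]
    steps[(start, end)] = math.dist(a, b)

for (start, end), length in steps.items():
    print(f"{start} -> {end}: {length:.3f}")
```

Because `centroids_3d` keeps only the first three PCs, these distances describe the plotted trajectory; the full-dimensional `centroids` entry would be the one to use for distances over all PCs.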

Return type:

dict

Raises:
  • ValueError – If condition_column not in adata.obs or PCA coordinates not found.

Stores:

  • adata.uns[store_key][condition_column] : dict - Full results dictionary as returned.

Examples

>>> import peach as pc
>>> # Compute centroids for treatment phase
>>> result = pc.tl.compute_conditional_centroids(adata, "treatment_phase")
>>> print(result["centroids_3d"])
{'chemo-naive': [1.2, 0.5, -0.3], 'IDS': [0.8, 1.1, 0.2]}
>>> # Then visualize with trajectory
>>> fig = pc.pl.archetypal_space(
...     adata, show_centroids=True, centroid_condition="treatment_phase", centroid_order=["chemo-naive", "IDS"]
... )
>>> # Multi-group centroids for trajectory comparison
>>> result = pc.tl.compute_conditional_centroids(adata, "treatment_phase", groupby="response_group")
>>> fig = pc.pl.archetypal_space(
...     adata,
...     show_centroids=True,
...     centroid_condition="treatment_phase",
...     centroid_groupby="response_group",
...     centroid_order=["chemo-naive", "IDS"],
...     centroid_colors={"long": "magenta", "short": "cyan"},
... )

See also

peach.pl.archetypal_space

Visualize with centroid trajectory overlay