spamosaic.utils
Utility functions for SpaMosaic.
Includes configuration loader, batching helpers, nearest-neighbor wrappers, clustering/UMAP utilities, and small AnnData helpers.
- class spamosaic.utils.Config(dictionary)[source]
Bases: object
A wrapper that recursively converts a nested dictionary to an object with attribute-style access.
- Parameters:
dictionary (dict) – Input configuration dictionary.
- __dict__
Internal storage for nested configuration items, enabling attribute access.
- Type:
dict
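The recursive conversion described above can be sketched as follows. This is a minimal illustration, not SpaMosaic's source: the class name `ConfigSketch` and the configuration keys (`model`, `hidden_dim`, `lr`, `seed`) are hypothetical.

```python
class ConfigSketch:
    """Illustrative stand-in for a dict-to-attribute wrapper (hypothetical name)."""

    def __init__(self, dictionary):
        for key, value in dictionary.items():
            # Nested dicts become nested wrapper objects; other values are stored as-is.
            if isinstance(value, dict):
                value = ConfigSketch(value)
            setattr(self, key, value)


# Hypothetical configuration keys, chosen only for the example.
cfg = ConfigSketch({"model": {"hidden_dim": 64, "lr": 0.01}, "seed": 0})
hidden_dim = cfg.model.hidden_dim  # attribute access instead of cfg["model"]["hidden_dim"]
```

Because values are set with `setattr`, they end up in the instance's `__dict__`, which matches the attribute documented above.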
- spamosaic.utils.check_batch_empty(modBatch_dict, verbose=True)[source]
Check that each batch contains at least one measured modality.
- Parameters:
modBatch_dict (dict) – Mapping {modality_name -> list[AnnData or None]}.
verbose (bool, default=True) – Whether to print batch composition.
- Returns:
For each batch index, a list of modality indices present in that batch.
- Return type:
list of list of int
- Raises:
ValueError – If any batch is completely empty.
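The documented contract can be illustrated with a small stand-alone sketch. It assumes that modality indices follow the dictionary's insertion order and that every modality lists the same number of batches; the function name `check_batch_empty_sketch` and the placeholder strings standing in for AnnData objects are hypothetical.

```python
def check_batch_empty_sketch(mod_batch_dict, verbose=True):
    """Return, per batch, the indices of modalities that are not None (illustrative)."""
    modalities = list(mod_batch_dict.keys())
    n_batches = len(mod_batch_dict[modalities[0]])
    present = []
    for b in range(n_batches):
        # Collect the positions of modalities measured in batch b.
        idx = [m for m, name in enumerate(modalities) if mod_batch_dict[name][b] is not None]
        if not idx:
            raise ValueError(f"batch {b} has no measured modality")
        if verbose:
            print(f"batch {b}: {[modalities[m] for m in idx]}")
        present.append(idx)
    return present


# Placeholder strings stand in for AnnData objects; None marks a missing modality.
layout = {"rna": ["ad_rna_0", None, "ad_rna_2"], "adt": [None, "ad_adt_1", "ad_adt_2"]}
result = check_batch_empty_sketch(layout, verbose=False)
```

Here batch 0 has only the first modality, batch 1 only the second, and batch 2 both.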
- spamosaic.utils.clustering(adata, n_cluster, used_obsm, algo='kmeans', key='tmp_clust')[source]
Cluster cells using k-means or Mclust and store labels in .obs.
- Parameters:
adata (AnnData) – Input data with an embedding in .obsm[used_obsm].
n_cluster (int) – Number of clusters.
used_obsm (str) – Key in .obsm to cluster on.
algo ({'kmeans', 'mclust'}, default='kmeans') – Clustering algorithm to use.
key (str, default='tmp_clust') – Column name in .obs to store cluster labels.
- Returns:
Annotated object with cluster assignments in .obs[key].
- Return type:
AnnData
- spamosaic.utils.dict_map(_dict, _list)[source]
Map a list of keys using a dictionary.
- Parameters:
_dict (dict) – Mapping dictionary.
_list (list) – List of keys to map.
- Returns:
List of mapped values.
- Return type:
list
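A mapping of this kind is typically a one-line comprehension. The sketch below assumes that a key missing from the dictionary raises `KeyError`; how the actual function treats missing keys is not stated here, and the name `dict_map_sketch` is hypothetical.

```python
def dict_map_sketch(_dict, _list):
    """Look up each key of _list in _dict, preserving order (illustrative)."""
    return [_dict[key] for key in _list]


mapped = dict_map_sketch({"a": 1, "b": 2}, ["b", "a", "b"])
```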
- spamosaic.utils.flip_axis(ads, axis=0)[source]
Flip the spatial coordinates of AnnData objects along a specified axis.
- Parameters:
ads (list of AnnData) – Data objects to modify (in-place).
axis ({0, 1}, default=0) – Axis to flip (0 for x, 1 for y).
- Return type:
None
- spamosaic.utils.get_barc2batch(modBatch_dict)[source]
Create a mapping from cell barcodes to their batch indices.
- Parameters:
modBatch_dict (dict) – Mapping {modality_name -> list[AnnData or None]}.
- Returns:
Dictionary {barcode -> batch_index}.
- Return type:
dict
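One way such a mapping could be built is sketched below. It assumes barcodes are read from each object's `obs_names` and that a barcode belongs to a single batch; `SimpleNamespace` objects stand in for AnnData, and the function name `barc2batch_sketch` is hypothetical.

```python
from types import SimpleNamespace


def barc2batch_sketch(mod_batch_dict):
    """Map every barcode to the index of the batch it appears in (illustrative)."""
    mapping = {}
    for batches in mod_batch_dict.values():
        for batch_idx, ad in enumerate(batches):
            if ad is None:  # modality not measured in this batch
                continue
            for barcode in ad.obs_names:
                mapping[barcode] = batch_idx
    return mapping


# Stand-ins for AnnData objects: only the obs_names attribute is used.
rna = [SimpleNamespace(obs_names=["c1", "c2"]), None]
adt = [SimpleNamespace(obs_names=["c1", "c2"]), SimpleNamespace(obs_names=["c3"])]
mapping = barc2batch_sketch({"rna": rna, "adt": adt})
```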
- spamosaic.utils.get_umap(ad, use_reps=[])[source]
Compute UMAP embeddings for specified representations and store them in .obsm.
- Parameters:
ad (AnnData) – Input object.
use_reps (list of str, default=[]) – Keys in .obsm to compute UMAP for (e.g., ['X_pca']).
- Returns:
The same object with additional .obsm[f'{rep}_umap'] for each rep.
- Return type:
AnnData
- spamosaic.utils.load_config(filepath)[source]
Load a YAML configuration file into a Config object.
- Parameters:
filepath (str) – Path to the YAML configuration file.
- Returns:
Parsed configuration object.
- Return type:
Config
- spamosaic.utils.mclust_R(adata, num_cluster, modelNames='EEE', used_obsm='emb', random_seed=2020)[source]
Run R’s Mclust (via rpy2) on an embedding to obtain soft clustering.
- Parameters:
adata (AnnData) – AnnData with embedding stored in .obsm.
num_cluster (int) – Desired number of clusters.
modelNames (str, default='EEE') – Covariance structure model in Mclust.
used_obsm (str, default='emb') – Key in .obsm to use for clustering.
random_seed (int, default=2020) – Random seed for both NumPy and R.
- Returns:
Annotated object with a categorical column obs['mclust'].
- Return type:
AnnData
- spamosaic.utils.nn_approx(ds1, ds2, norm=True, knn=10, metric='manhattan', n_trees=10, include_distances=False)[source]
Perform approximate nearest-neighbor search using Annoy.
- Parameters:
ds1 (np.ndarray) – Query data of shape (N1, D).
ds2 (np.ndarray) – Reference data of shape (N2, D).
norm (bool, default=True) – Whether to L2-normalize ds1 and ds2 before indexing/search.
knn (int, default=10) – Number of nearest neighbors to retrieve per query.
metric (str, default='manhattan') – Distance metric for Annoy (e.g., 'manhattan', 'euclidean').
n_trees (int, default=10) – Number of trees in the Annoy index (trade-off between speed/accuracy).
include_distances (bool, default=False) – If True, also return distances.
- Returns:
If include_distances is False, returns an indices array of shape (N1, knn). Otherwise returns (indices, distances), each of shape (N1, knn).
- Return type:
np.ndarray or tuple of (np.ndarray, np.ndarray)
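The input/output contract can be illustrated with an exact brute-force search. This sketch deliberately swaps Annoy's approximate index for an exhaustive Manhattan-distance scan over plain Python lists, so it shows the shapes and the effect of `norm`, not the library's performance or its internals; the names `l2_normalize` and `nn_exact_sketch` are hypothetical.

```python
import math


def l2_normalize(rows):
    """Scale each row to unit L2 norm (rows of all zeros are left unchanged)."""
    out = []
    for row in rows:
        norm = math.sqrt(sum(v * v for v in row)) or 1.0
        out.append([v / norm for v in row])
    return out


def nn_exact_sketch(ds1, ds2, norm=True, knn=10, include_distances=False):
    """Exact k-nearest-neighbor search with Manhattan distance (illustrative)."""
    if norm:
        ds1, ds2 = l2_normalize(ds1), l2_normalize(ds2)
    indices, distances = [], []
    for query in ds1:
        # Distance from this query to every reference row.
        d = [sum(abs(a - b) for a, b in zip(query, ref)) for ref in ds2]
        order = sorted(range(len(ds2)), key=d.__getitem__)[:knn]
        indices.append(order)
        distances.append([d[j] for j in order])
    return (indices, distances) if include_distances else indices


queries = [[1.0, 0.0], [0.0, 2.0]]
reference = [[2.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
idx, dist = nn_exact_sketch(queries, reference, knn=2, include_distances=True)
```

Each row of `idx` lists positions in the reference data, nearest first. After normalization the first query coincides with the first reference row, so its nearest distance is zero.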
- spamosaic.utils.plot_basis(ad, basis, color, **kwargs)[source]
Wrapper around scanpy.pl.embedding with warning suppression.
- Parameters:
ad (AnnData) – Annotated data object.
basis (str) – Name of the embedding basis (e.g., 'umap' or 'spatial').
color (str) – Column in .obs to color by.
**kwargs – Additional keyword arguments passed to scanpy.pl.embedding.
- Return type:
None
- spamosaic.utils.reorder(ad1, ad2)[source]
Align and reorder two AnnData objects to their shared barcodes.
- Parameters:
ad1 (AnnData) – First object.
ad2 (AnnData) – Second object.
- Returns:
Views of ad1 and ad2 containing only shared barcodes, with matching order.
- Return type:
tuple of (AnnData, AnnData)
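The alignment step can be sketched on plain barcode lists. The sketch assumes the shared barcodes keep the order in which they appear in the first object; the order used by the actual function is not stated here, and the name `shared_order_sketch` is hypothetical.

```python
def shared_order_sketch(names1, names2):
    """Return shared barcodes plus their row positions in each input (illustrative)."""
    in_second = set(names2)
    # Keep the first object's order for the shared barcodes (an assumption).
    shared = [b for b in names1 if b in in_second]
    pos1 = {b: i for i, b in enumerate(names1)}
    pos2 = {b: i for i, b in enumerate(names2)}
    return shared, [pos1[b] for b in shared], [pos2[b] for b in shared]


shared, rows1, rows2 = shared_order_sketch(["c1", "c2", "c3", "c4"], ["c4", "c2", "c9"])
```

With AnnData objects, indexing by the shared barcode list (for example `ad1[shared]`) yields views whose rows line up one to one.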
- spamosaic.utils.split_adata_ob(ads, ad_ref, ob='obs', key='emb')[source]
Split a merged AnnData object’s observations/embeddings back to per-batch objects.
- Parameters:
ads (list of AnnData) – Target AnnData objects to receive splits.
ad_ref (AnnData) – Source AnnData containing concatenated .obs or .obsm.
ob ({'obs', 'obsm'}, default='obs') – Which attribute to split.
key (str, default='emb') – Key in .obs or .obsm to split and assign.
- Return type:
None
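The splitting idea can be sketched on plain lists. This assumes the rows of the merged object are concatenated in the same order as `ads`, so each batch receives a contiguous slice; the actual function may instead match rows by barcode. The name `split_rows_sketch` and the example values are hypothetical.

```python
def split_rows_sketch(merged_rows, batch_sizes):
    """Cut a concatenated row list into consecutive per-batch chunks (illustrative)."""
    if sum(batch_sizes) != len(merged_rows):
        raise ValueError("batch sizes do not add up to the number of merged rows")
    chunks, start = [], 0
    for size in batch_sizes:
        chunks.append(merged_rows[start:start + size])
        start += size
    return chunks


# Five merged embedding rows belonging to two batches of 2 and 3 cells.
merged_emb = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6], [0.7, 0.8], [0.9, 1.0]]
per_batch = split_rows_sketch(merged_emb, [2, 3])
```

Each chunk would then be assigned to the matching target object, for example under `.obsm[key]` when `ob='obsm'`.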