spamosaic.build_graph

Graph construction utilities for spatial transcriptomics.

This module provides helpers to build spatial neighbor graphs and cross-batch mutual nearest neighbor (MNN) matches on top of AnnData objects.

spamosaic.build_graph.Cal_Spatial_Net(adata, rad_cutoff=None, k_cutoff=None, max_neigh=50, model='Radius', verbose=True)[source]

Construct a spatial graph from spatial coordinates using either a radius or a kNN method.

Parameters:
  • adata (AnnData) – Input annotated data.

  • rad_cutoff (float, optional) – Distance cutoff for radius-based graph.

  • k_cutoff (int, optional) – Number of neighbors for kNN-based graph.

  • max_neigh (int) – Maximum number of neighbors to consider.

  • model (str) – Type of graph construction: ‘Radius’ or ‘KNN’.

  • verbose (bool) – Whether to print debug info.

Return type:

None

Notes

On success, stores the adjacency matrix (a SciPy sparse matrix) in adata.uns['adj'].
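The two graph types can be illustrated with a minimal sketch. It is not the library's implementation: spatial_adjacency is a hypothetical helper that works on a plain (N, 2) coordinate array instead of an AnnData object, uses scikit-learn for the neighbor search, and returns the matrix rather than writing it to adata.uns['adj'].

```python
import numpy as np
import scipy.sparse as sp
from sklearn.neighbors import NearestNeighbors


def spatial_adjacency(coords, rad_cutoff=None, k_cutoff=None, model="Radius"):
    """Hypothetical helper: binary spatial adjacency from an (N, 2) array."""
    if model == "Radius":
        # Connect every pair of spots closer than rad_cutoff.
        nn = NearestNeighbors(radius=rad_cutoff).fit(coords)
        adj = nn.radius_neighbors_graph(coords, mode="connectivity")
    elif model == "KNN":
        # Ask for k + 1 neighbors because each spot is its own nearest neighbor.
        nn = NearestNeighbors(n_neighbors=k_cutoff + 1).fit(coords)
        adj = nn.kneighbors_graph(coords, mode="connectivity")
    else:
        raise ValueError("model must be 'Radius' or 'KNN'")
    adj = sp.csr_matrix(adj)
    adj.setdiag(0)  # drop the self-matches
    adj.eliminate_zeros()
    return adj


# Usage on a 3 x 3 unit grid (illustrative data only).
grid = np.array([[x, y] for x in range(3) for y in range(3)], dtype=float)
radius_graph = spatial_adjacency(grid, rad_cutoff=1.1, model="Radius")
knn_graph = spatial_adjacency(grid, k_cutoff=2, model="KNN")
```

Note that the kNN graph is generally not symmetric, whereas the radius graph always is.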

spamosaic.build_graph.build_intra_graph(ads, rad_cutoff, knns)[source]

Construct intra-batch spatial graphs for a list of AnnData objects.

Parameters:
  • ads (list of AnnData) – List of spatial AnnData objects. None entries are skipped.

  • rad_cutoff (float) – Distance threshold for the radius graph.

  • knns (list of int) – Maximum number of neighbors to query for each AnnData; must have the same length as ads.

Return type:

None
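As a rough sketch of the per-batch loop, the example below builds one radius graph per batch with a cap on the neighbors queried, and skips None entries. Everything here is an assumption made for illustration: capped_radius_graph and build_graphs are hypothetical names, the inputs are plain coordinate arrays rather than AnnData objects, and the graphs are returned in a list instead of being stored on each object.

```python
import numpy as np
import scipy.sparse as sp
from sklearn.neighbors import NearestNeighbors


def capped_radius_graph(coords, rad_cutoff, max_neigh):
    """Hypothetical helper: radius graph with at most max_neigh neighbors per spot."""
    n = coords.shape[0]
    k = min(max_neigh + 1, n)  # +1 because each spot finds itself first
    dist, idx = NearestNeighbors(n_neighbors=k).fit(coords).kneighbors(coords)
    rows = np.repeat(np.arange(n), k)
    cols = idx.ravel()
    # Keep neighbors inside the radius and discard self-matches.
    keep = (dist.ravel() <= rad_cutoff) & (rows != cols)
    return sp.csr_matrix(
        (np.ones(keep.sum()), (rows[keep], cols[keep])), shape=(n, n)
    )


def build_graphs(coord_list, rad_cutoff, knns):
    """Build one graph per batch; None entries are skipped and stay None."""
    graphs = []
    for coords, knn in zip(coord_list, knns):
        if coords is None:
            graphs.append(None)
        else:
            graphs.append(capped_radius_graph(coords, rad_cutoff, knn))
    return graphs


# Usage: the second batch has no spatial data and is skipped.
grid = np.array([[x, y] for x in range(3) for y in range(3)], dtype=float)
graphs = build_graphs([grid, None], rad_cutoff=1.1, knns=[6, 6])
```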

spamosaic.build_graph.build_mnn_graph(bridge_ads, test_ads, use_rep, batch_key, knn_base=10, auto_knn=False, auto_thr=0.8, rmv_outlier=False, contamination='auto', seed=1234)[source]

Build a mutual nearest neighbor (MNN) graph across batches in one modality.

This procedure matches cells across batches using approximate kNN (Annoy) on a given embedding, returning a set of matched barcode pairs. It supports (1) bridge–bridge matches (between multi-modal batches) and (2) bridge–test matches (between a bridge batch and a single-modality batch). Optionally, it can remove spatial outliers.

Parameters:
  • bridge_ads (list of AnnData) – AnnData objects from bridge batches.

  • test_ads (list of AnnData) – AnnData objects from test batches.

  • use_rep (str) – Key in .obsm containing the embedding to search.

  • batch_key (str) – Column in .obs used only for logging/batch names.

  • knn_base (int, optional) – Base number of neighbors per side. Default is 10.

  • auto_knn (bool, optional) – If True, use determine_kSize to adapt k based on batch sizes.

  • auto_thr (float, optional) – Size ratio threshold for auto_knn. Default is 0.8.

  • rmv_outlier (bool, optional) – Whether to run remove_outlier. Default is False.

  • contamination ({'auto', float}, optional) – Outlier rate used by Isolation Forest.

  • seed (int, optional) – Random seed for stochastic components.

Returns:

Set of matched barcode pairs across batches.

Return type:

set of tuple of str
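The mutual nearest neighbor criterion itself can be sketched for a single pair of batches. This is an illustration under stated assumptions, not the library's code: it swaps the approximate Annoy search for an exact search with scikit-learn's NearestNeighbors, takes plain arrays and barcode lists instead of AnnData objects, and mnn_pairs with its k1 and k2 arguments is a hypothetical name and signature.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors


def mnn_pairs(emb1, names1, emb2, names2, k1=10, k2=10):
    """Hypothetical helper: mutual nearest neighbor pairs between two batches.

    k1: neighbors searched in batch 2 for each cell of batch 1.
    k2: neighbors searched in batch 1 for each cell of batch 2.
    """
    k1 = min(k1, len(names2))
    k2 = min(k2, len(names1))
    # Neighbors in batch 2 for every cell of batch 1.
    idx12 = NearestNeighbors(n_neighbors=k1).fit(emb2).kneighbors(
        emb1, return_distance=False
    )
    # Neighbors in batch 1 for every cell of batch 2.
    idx21 = NearestNeighbors(n_neighbors=k2).fit(emb1).kneighbors(
        emb2, return_distance=False
    )
    forward = {(int(i), int(j)) for i, row in enumerate(idx12) for j in row}
    backward = {(int(i), int(j)) for j, row in enumerate(idx21) for i in row}
    # A pair is kept only if each cell lists the other among its neighbors.
    return {(names1[i], names2[j]) for i, j in forward & backward}


# Usage with toy 2-D embeddings (illustrative data only).
emb_a = np.array([[0.0, 0.0], [10.0, 10.0]])
emb_b = np.array([[0.1, 0.0], [10.0, 10.1], [4.0, 4.0]])
pairs = mnn_pairs(emb_a, ["a1", "a2"], emb_b, ["b1", "b2", "b3"], k1=1, k2=1)
```

In this toy example b3 picks a1 as its neighbor, but a1 picks b1, so the pair (a1, b3) is not mutual and is dropped.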

spamosaic.build_graph.determine_kSize(adi, adj, knn_base, auto_thr)[source]

Determine asymmetric k values for MNN search based on dataset sizes.

If two datasets have similar numbers of observations (the smaller-to-larger ratio is at least auto_thr), both sides use knn_base. Otherwise, the smaller side uses floor(knn_base * size_ratio) (at least 1).

Parameters:
  • adi (AnnData) – First dataset.

  • adj (AnnData) – Second dataset.

  • knn_base (int) – Base number of neighbors.

  • auto_thr (float) – Size similarity threshold, e.g. 0.8.

Returns:

(knn_adi, knn_adj) to use for the pair.

Return type:

tuple of int
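The rule above can be written out as a short sketch. choose_k is a hypothetical stand-in that takes observation counts directly instead of AnnData objects. The description does not state what the larger side uses when the sizes differ, so the sketch assumes it keeps knn_base.

```python
import math


def choose_k(n_i, n_j, knn_base=10, auto_thr=0.8):
    """Hypothetical helper: pick (k_i, k_j) from the two dataset sizes."""
    size_ratio = min(n_i, n_j) / max(n_i, n_j)
    if size_ratio >= auto_thr:
        # Similar sizes: both sides use the base k.
        return knn_base, knn_base
    # Dissimilar sizes: shrink k for the smaller side, but never below 1.
    k_small = max(1, math.floor(knn_base * size_ratio))
    # Assumption: the larger side keeps knn_base (not stated in the source).
    return (k_small, knn_base) if n_i < n_j else (knn_base, k_small)


similar = choose_k(900, 1000)     # ratio 0.9 is above the threshold
dissimilar = choose_k(250, 1000)  # ratio 0.25 shrinks the smaller side's k
```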

spamosaic.build_graph.make_Ahat_sparse(A, improved=False, symm=False)[source]

Build a normalized adjacency matrix A_hat suitable for GNN layers.

Given a sparse adjacency A (N×N), this function optionally symmetrizes it, adds self-loops, and applies the symmetric normalization \(D^{-1/2} A D^{-1/2}\).

Parameters:
  • A (scipy.sparse.spmatrix) – Sparse adjacency matrix of shape (N, N).

  • improved (bool, optional) – If True, use a self-loop weight of 2.0 (as in “improved” GCN).

  • symm (bool, optional) – If True, force symmetry by replacing A with (A + A.T) before normalization.

Returns:

Normalized sparse matrix A_hat of shape (N, N).

Return type:

scipy.sparse.csr_matrix
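A minimal sketch of this normalization with SciPy is given below. normalize_adjacency is a hypothetical name, and two details are assumptions not confirmed by the description: the self-loop weight is 1.0 when improved is False, and the degree matrix D is computed after the self-loops are added.

```python
import numpy as np
import scipy.sparse as sp


def normalize_adjacency(A, improved=False, symm=False):
    """Hypothetical helper: D^{-1/2} (A + w I) D^{-1/2} as a CSR matrix."""
    A = sp.csr_matrix(A, dtype=float)
    if symm:
        A = A + A.T  # force symmetry as described above
    fill = 2.0 if improved else 1.0  # assumed default self-loop weight of 1.0
    A = A + fill * sp.eye(A.shape[0], format="csr")
    # Degrees are taken after adding self-loops (assumption).
    deg = np.asarray(A.sum(axis=1)).ravel()
    d_inv_sqrt = np.zeros_like(deg)
    d_inv_sqrt[deg > 0] = deg[deg > 0] ** -0.5
    D = sp.diags(d_inv_sqrt)
    return (D @ A @ D).tocsr()


# Usage on a two-node graph with a single undirected edge.
A = sp.csr_matrix(np.array([[0.0, 1.0], [1.0, 0.0]]))
A_hat = normalize_adjacency(A)
```

For this two-node graph every node has degree 2 after the self-loop is added, so each entry of A_hat is 1/2.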

spamosaic.build_graph.remove_outlier(mnn_set, ad1, ad2, contamination='auto')[source]

Filter spatial outliers from an MNN pair set using Isolation Forest.

A feature matrix is built from the concatenated spatial coordinates of both cells and their differences. Pairs predicted as outliers (label -1) are removed.

Parameters:
  • mnn_set (set of tuple of str) – Set of MNN barcode pairs (cell_id_1, cell_id_2).

  • ad1 (AnnData) – Dataset of the first cell; must contain .obsm['spatial'].

  • ad2 (AnnData) – Dataset of the second cell; must contain .obsm['spatial'].

  • contamination ({'auto', float}, optional) – Expected outlier fraction; passed to sklearn.ensemble.IsolationForest.

Returns:

Filtered set with spatial outliers removed.

Return type:

set of tuple of str
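The filtering step can be sketched as follows. filter_pairs is a hypothetical helper, and the coordinate lookups are plain dicts from barcode to (x, y) standing in for .obsm['spatial']. The seed argument is also an addition for reproducibility and is not part of the signature above.

```python
import numpy as np
from sklearn.ensemble import IsolationForest


def filter_pairs(pairs, coords1, coords2, contamination="auto", seed=0):
    """Hypothetical helper: drop MNN pairs flagged as spatial outliers."""
    pairs = sorted(pairs)  # fix an order so feature rows line up with pairs
    xy1 = np.array([coords1[a] for a, _ in pairs], dtype=float)
    xy2 = np.array([coords2[b] for _, b in pairs], dtype=float)
    # Features per pair: both coordinates plus their difference.
    feats = np.hstack([xy1, xy2, xy1 - xy2])
    forest = IsolationForest(contamination=contamination, random_state=seed)
    labels = forest.fit_predict(feats)  # 1 = inlier, -1 = outlier
    return {p for p, lab in zip(pairs, labels) if lab == 1}


# Usage: 40 well-aligned pairs plus one pair with a large spatial offset.
rng = np.random.default_rng(0)
coords1 = {f"a{i}": rng.uniform(0, 10, size=2) for i in range(41)}
coords2 = {f"b{i}": coords1[f"a{i}"] + rng.normal(0, 0.1, size=2) for i in range(41)}
coords2["b40"] = coords1["a40"] + 100.0
pairs = {(f"a{i}", f"b{i}") for i in range(41)}
kept = filter_pairs(pairs, coords1, coords2, contamination=0.1)
```

Which pairs are removed depends on the fitted forest, so the result should be inspected rather than assumed.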

Functions

spamosaic.build_graph.Cal_Spatial_Net

Construct a spatial graph from spatial coordinates using either a radius or a kNN method.

spamosaic.build_graph.build_intra_graph

Construct intra-batch spatial graphs for a list of AnnData objects.

spamosaic.build_graph.build_mnn_graph

Build a mutual nearest neighbor (MNN) graph across batches in one modality.

spamosaic.build_graph.determine_kSize

Determine asymmetric k values for MNN search based on dataset sizes.

spamosaic.build_graph.remove_outlier

Filter spatial outliers from an MNN pair set using Isolation Forest.

spamosaic.build_graph.make_Ahat_sparse

Build a normalized adjacency matrix A_hat suitable for GNN layers.