spamosaic.preprocessing.lsiTransformer
- class spamosaic.preprocessing.lsiTransformer(n_components: int = 20, drop_first=True, use_highly_variable=None, log=True, norm=True, z_score=True, tfidf=True, svd=True, use_counts=False, pcaAlgo='arpack')
Latent Semantic Indexing (LSI) pipeline for dimensionality reduction.
- Parameters:
n_components (int) – Number of SVD components.
drop_first (bool) – Whether to drop the first principal component.
use_highly_variable (bool or None) – Whether to subset to highly variable features.
log (bool) – Whether to apply log1p transformation.
norm (bool) – Whether to normalize features.
z_score (bool) – Whether to z-score features.
tfidf (bool) – Whether to apply TF-IDF normalization.
svd (bool) – Whether to apply SVD transformation.
use_counts (bool) – Use .layers['counts'] instead of .X for data.
pcaAlgo (str) – SVD backend (e.g., 'arpack').
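
The parameters above describe a chain of optional steps (TF-IDF, log1p, SVD, dropping the first component, z-scoring). The NumPy-only sketch below illustrates what such a chain can look like on a dense cells-by-features count matrix. It is not spamosaic's implementation: the function name `lsi_sketch`, the `1e4` scale factor, the order of the steps, the per-component z-scoring, and the choice to compute one extra component so that `n_components` columns remain after dropping the first are all assumptions made for illustration.

```python
import numpy as np


def lsi_sketch(counts, n_components=20, drop_first=True, log=True, z_score=True):
    """Illustrative LSI chain (hypothetical helper, not the library's code)."""
    X = np.asarray(counts, dtype=float)

    # Term frequency: divide each row (cell) by its total count.
    row_sums = X.sum(axis=1, keepdims=True)
    row_sums[row_sums == 0] = 1.0  # avoid division by zero for empty cells
    tf = X / row_sums

    # Inverse document frequency: down-weight features with high total counts.
    col_sums = X.sum(axis=0)
    col_sums[col_sums == 0] = 1.0  # avoid division by zero for empty features
    idf = X.shape[0] / col_sums
    M = tf * idf

    # Optional log1p; the 1e4 scale factor is an assumption of this sketch.
    if log:
        M = np.log1p(M * 1e4)

    # Full SVD, then keep the leading components (fine for small matrices).
    U, S, _ = np.linalg.svd(M, full_matrices=False)
    k = n_components + 1 if drop_first else n_components
    k = min(k, S.shape[0])
    emb = U[:, :k] * S[:k]

    # Optionally discard the first component.
    if drop_first:
        emb = emb[:, 1:]

    # Optional z-score, applied per component (column) in this sketch.
    if z_score:
        std = emb.std(axis=0)
        std[std == 0] = 1.0
        emb = (emb - emb.mean(axis=0)) / std

    return emb


# Toy usage on random Poisson counts: 50 cells, 30 features.
rng = np.random.default_rng(0)
counts = rng.poisson(2.0, size=(50, 30))
emb = lsi_sketch(counts, n_components=5)
print(emb.shape)
```

For large sparse matrices a truncated solver would be the more practical choice, which is presumably the role of the pcaAlgo option; the full `np.linalg.svd` call here only keeps the sketch free of extra dependencies.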