spamosaic.utils.nn_approx
- spamosaic.utils.nn_approx(ds1, ds2, norm=True, knn=10, metric='manhattan', n_trees=10, include_distances=False)
Perform approximate nearest-neighbor search using Annoy.
- Parameters:
ds1 (np.ndarray) – Query data of shape (N1, D).
ds2 (np.ndarray) – Reference data of shape (N2, D).
norm (bool, default=True) – Whether to L2-normalize ds1 and ds2 before indexing/search.
knn (int, default=10) – Number of nearest neighbors to retrieve per query.
metric (str, default='manhattan') – Distance metric for Annoy (e.g., 'manhattan', 'euclidean').
n_trees (int, default=10) – Number of trees in the Annoy index (trade-off between speed/accuracy).
include_distances (bool, default=False) – If True, also return distances.
- Returns:
If include_distances is False, returns an indices array of shape (N1, knn), where each row holds indices into ds2. Otherwise returns (indices, distances), both of shape (N1, knn).
- Return type:
np.ndarray or tuple of (np.ndarray, np.ndarray)
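
The input and output shapes described above can be illustrated with an exact brute-force search in NumPy. This is only a sketch for comparison, not the Annoy-based implementation: `exact_knn` is a hypothetical helper that is not part of spamosaic, and it assumes that `norm=True` means row-wise L2 normalization and that the returned indices refer to rows of `ds2`.

```python
import numpy as np


def exact_knn(ds1, ds2, norm=True, knn=10, metric="manhattan",
              include_distances=False):
    """Exact brute-force k-NN (hypothetical helper, not part of spamosaic)."""
    ds1 = np.asarray(ds1, dtype=np.float64)
    ds2 = np.asarray(ds2, dtype=np.float64)
    if norm:
        # Assumed meaning of norm=True: scale every row to unit L2 length.
        ds1 = ds1 / np.maximum(np.linalg.norm(ds1, axis=1, keepdims=True), 1e-12)
        ds2 = ds2 / np.maximum(np.linalg.norm(ds2, axis=1, keepdims=True), 1e-12)

    # Pairwise differences, shape (N1, N2, D). Fine for small inputs only.
    diff = ds1[:, None, :] - ds2[None, :, :]
    if metric == "manhattan":
        dist = np.abs(diff).sum(axis=2)
    elif metric == "euclidean":
        dist = np.sqrt((diff ** 2).sum(axis=2))
    else:
        raise ValueError(f"unsupported metric in this sketch: {metric!r}")

    # Indices of the knn closest reference rows per query, nearest first.
    indices = np.argsort(dist, axis=1, kind="stable")[:, :knn]
    if include_distances:
        return indices, np.take_along_axis(dist, indices, axis=1)
    return indices


rng = np.random.default_rng(0)
query = rng.normal(size=(5, 8))       # (N1, D)
reference = rng.normal(size=(50, 8))  # (N2, D)

idx, dst = exact_knn(query, reference, knn=3, include_distances=True)
print(idx.shape, dst.shape)  # (5, 3) (5, 3)
```

Running `nn_approx` on the same `query` and `reference` arrays and counting how many of its indices also appear in `idx` gives a rough recall estimate, which may help when choosing `n_trees`.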