spamosaic.utils.nn_approx

spamosaic.utils.nn_approx(ds1, ds2, norm=True, knn=10, metric='manhattan', n_trees=10, include_distances=False)

Perform approximate nearest-neighbor search using Annoy.

Parameters:
  • ds1 (np.ndarray) – Query data of shape (N1, D).

  • ds2 (np.ndarray) – Reference data of shape (N2, D).

  • norm (bool, default=True) – Whether to L2-normalize ds1 and ds2 before indexing/search.

  • knn (int, default=10) – Number of nearest neighbors to retrieve per query.

  • metric (str, default='manhattan') – Distance metric for Annoy (e.g., 'manhattan', 'euclidean').

  • n_trees (int, default=10) – Number of trees in the Annoy index. More trees generally improve accuracy at the cost of a larger index and longer build time.

  • include_distances (bool, default=False) – If True, also return distances.

Returns:

If include_distances is False, returns an array of neighbor indices of shape (N1, knn). Otherwise, returns a tuple (indices, distances), where both arrays have shape (N1, knn).

Return type:

np.ndarray or tuple of (np.ndarray, np.ndarray)
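
Example:

The sketch below illustrates the input/output contract described above using an exact brute-force search in NumPy. It is not the spamosaic implementation and does not use Annoy: nn_exact is a hypothetical helper written for this example. It assumes that normalization is applied row-wise and that the returned indices refer to rows of ds2. An approximate search should return mostly the same neighbors, in arrays of the same shapes.

```python
import numpy as np


def nn_exact(ds1, ds2, norm=True, knn=10, include_distances=False):
    """Exact Manhattan k-NN of each ds1 row among the rows of ds2.

    Hypothetical reference helper, not part of spamosaic.
    """
    ds1 = np.asarray(ds1, dtype=float)
    ds2 = np.asarray(ds2, dtype=float)
    if norm:
        # Assumption: L2-normalize each row; the floor avoids division by zero.
        ds1 = ds1 / np.maximum(np.linalg.norm(ds1, axis=1, keepdims=True), 1e-12)
        ds2 = ds2 / np.maximum(np.linalg.norm(ds2, axis=1, keepdims=True), 1e-12)
    # Pairwise Manhattan distances, shape (N1, N2).
    dists = np.abs(ds1[:, None, :] - ds2[None, :, :]).sum(axis=2)
    # Indices of the knn closest reference rows, nearest first.
    idx = np.argsort(dists, axis=1)[:, :knn]
    if include_distances:
        return idx, np.take_along_axis(dists, idx, axis=1)
    return idx


rng = np.random.default_rng(0)
query = rng.normal(size=(5, 8))        # N1 = 5, D = 8
reference = rng.normal(size=(50, 8))   # N2 = 50, D = 8

indices, distances = nn_exact(query, reference, knn=3, include_distances=True)
print(indices.shape, distances.shape)  # (5, 3) (5, 3)
```

The brute-force version materializes an (N1, N2, D) intermediate array, so it is only practical for small inputs. That cost is what an approximate index is meant to avoid.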