spamosaic.MNN.nn_approx
- spamosaic.MNN.nn_approx(ds1, ds2, names1, names2, knn=50)
Approximate nearest-neighbor search using HNSW (hnswlib).
- Parameters:
ds1 (np.ndarray) – Query dataset of shape (N1, D).
ds2 (np.ndarray) – Reference dataset of shape (N2, D).
names1 (list of str) – Identifiers for rows in ds1.
names2 (list of str) – Identifiers for rows in ds2.
knn (int, default=50) – Number of nearest neighbors to find for each query.
- Returns:
Set of matched (query_name, reference_name) pairs.
- Return type:
set[tuple[str, str]]
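
The input and output contract described above can be illustrated with a minimal sketch. This is not the library's implementation: it swaps the HNSW index for an exact brute-force search in plain Python so that it runs without hnswlib. The function name `nn_exact` is hypothetical, and the squared Euclidean metric is an assumption, since the documentation above does not state which distance is used.

```python
import heapq


def nn_exact(ds1, ds2, names1, names2, knn=50):
    """Hypothetical brute-force stand-in for nn_approx (not the library's code)."""
    k = min(knn, len(ds2))  # cannot return more neighbors than reference rows
    pairs = set()
    for q_name, q in zip(names1, ds1):
        # Squared Euclidean distance from this query row to every reference row.
        # The metric is an assumption; the source does not say which one is used.
        dists = [
            (sum((a - b) ** 2 for a, b in zip(q, r)), j)
            for j, r in enumerate(ds2)
        ]
        # Keep the k closest reference rows and record them as name pairs.
        for _, j in heapq.nsmallest(k, dists):
            pairs.add((q_name, names2[j]))
    return pairs


queries = [[0.0, 0.0], [5.0, 5.0]]
refs = [[0.1, 0.0], [4.9, 5.2], [10.0, 10.0]]
matches = nn_exact(queries, refs, ["q0", "q1"], ["r0", "r1", "r2"], knn=1)
print(sorted(matches))  # [('q0', 'r0'), ('q1', 'r1')]
```

Each query contributes at most knn pairs, so the returned set holds at most N1 × knn entries. An approximate index such as HNSW may return a slightly different set than this exact search, in exchange for much faster queries on large reference datasets.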