maxfuse.graph.get_nearest_neighbors

maxfuse.graph.get_nearest_neighbors(query_arr, target_arr, svd_components=None, randomized_svd=False, svd_runs=1, metric='correlation')

For each row in query_arr, compute its nearest neighbor in target_arr.

Parameters:
  • query_arr (np.array of shape (n_samples1, n_features)) – The query data matrix.

  • target_arr (np.array of shape (n_samples2, n_features)) – The target data matrix.

  • svd_components (None or int, default=None) – If not None, SVD is first applied to the vertically stacked query_arr and target_arr to reduce their dimension before the neighbor search.

  • randomized_svd (bool, default=False) – Whether to use randomized SVD.

  • svd_runs (int, default=1) – Number of SVD instances to run; the one with the lowest Frobenius reconstruction error is selected.

  • metric (string, default='correlation') – The metric to use in nearest neighbor search.

Returns:

  • neighbors (np.array of shape (n_samples1,)) – The i-th element is the index of the row in target_arr that is closest to the i-th row of query_arr.

  • dists (np.array of shape (n_samples1,)) – The i-th element is the distance between the i-th row of query_arr and its nearest neighbor, target_arr[neighbors[i]].
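
Example:

The behavior described above can be illustrated with a minimal NumPy sketch. This is not the MaxFuse implementation: the helper name nearest_neighbors_sketch is hypothetical, the exact (non-randomized) SVD and the brute-force search are assumptions, and randomized_svd and svd_runs are left out. Only the default 'correlation' metric is shown.

```python
import numpy as np


def nearest_neighbors_sketch(query_arr, target_arr, svd_components=None):
    """Hypothetical stand-in for illustration; not the library's code."""
    query_arr = np.asarray(query_arr, dtype=float)
    target_arr = np.asarray(target_arr, dtype=float)
    n_query = query_arr.shape[0]

    if svd_components is not None:
        # Assumption: reduce the vertically stacked matrix with an exact,
        # truncated SVD, then split it back into query and target rows.
        stacked = np.vstack([query_arr, target_arr])
        u, s, _ = np.linalg.svd(stacked, full_matrices=False)
        reduced = u[:, :svd_components] * s[:svd_components]
        query_arr, target_arr = reduced[:n_query], reduced[n_query:]

    # Correlation distance = 1 - Pearson correlation between two rows.
    # Rows that are constant (including any row with a single feature)
    # have zero norm after centering and would produce NaN here.
    q = query_arr - query_arr.mean(axis=1, keepdims=True)
    t = target_arr - target_arr.mean(axis=1, keepdims=True)
    q = q / np.linalg.norm(q, axis=1, keepdims=True)
    t = t / np.linalg.norm(t, axis=1, keepdims=True)
    dist_matrix = 1.0 - q @ t.T  # shape (n_samples1, n_samples2)

    # For each query row, keep the closest target row and its distance.
    neighbors = dist_matrix.argmin(axis=1)
    dists = dist_matrix[np.arange(n_query), neighbors]
    return neighbors, dists


rng = np.random.default_rng(0)
target = rng.normal(size=(5, 8))
# Scaling and shifting a row leaves its correlation with the original at 1,
# so each query row should match the target row it was built from.
query = target[[3, 0, 4]] * 2.0 + 1.0

neighbors, dists = nearest_neighbors_sketch(query, target)
print(neighbors)  # [3 0 4]

# With SVD reduction the outputs keep the same shapes, (n_samples1,).
neighbors_svd, dists_svd = nearest_neighbors_sketch(query, target, svd_components=4)
```

Both return values have one entry per row of query_arr, so dists[i] can be read directly alongside neighbors[i]. Assuming the signature documented above, the corresponding library call would be maxfuse.graph.get_nearest_neighbors(query, target, svd_components=4).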