A Julia implementation of Nearest Neighbor Descent.
Dong, Wei et al. Efficient K-Nearest Neighbor Graph Construction for Generic Similarity Measures. WWW (2011).
Nearest Neighbor Descent (NNDescent) is an approximate K-nearest neighbor graph construction algorithm that has several useful properties:
NNDescent is based on the heuristic argument that a neighbor of a neighbor is also likely to be a neighbor. That is, given a list of approximate nearest neighbors to a point, we can improve that list by exploring the neighbors of each point in the list. The algorithm is in essence the repeated application of this principle.
Approximate kNN graph construction on a dataset:
using NearestNeighborDescent using Distances data = [rand(20) for _ in 1:1000] n_neighbors = 10 metric = Euclidean() graph = nndescent(data, n_neighbors, metric)
The approximate KNNs of the original dataset can be retrieved from the resulting graph with
# return the approximate knns as KxN matrices of indexes and distances, where # indices[j, i] and distances[j, i] are the index of and distance to node i's jth # nearest neighbor, respectively. indices, distances = knn_matrices(graph)
To find the approximate neighbors for new points with respect to an already constructed graph:
queries = [rand(20) for _ in 1:20] n_neighbors = 5 indices, distances = search(graph, queries, n_neighbors)
3 days ago