Navigating K-Nearest Neighbor Graphs to Solve Nearest Neighbor Searches

  • Edgar Chávez
  • Eric Sadit Tellez
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6256)

Abstract

Nearest neighbor queries can be satisfied, in principle, with a greedy algorithm over a proximity graph. Each object in the database is represented by a node, and proximal nodes in this graph share an edge. The idea for finding the nearest neighbor is quite simple: we start at a random node and get iteratively closer to the nearest neighbor, following only edges of the proximity graph. Every node adjacent to the current vertex is reviewed, and only the one closest to the query is expanded in the next round. The algorithm stops when none of the neighbors of the current node is closer to the query than the current node itself. The number of reviewed objects is proportional to the diameter of the graph times the average degree of the nodes. Unfortunately, the degree of a proximity graph is unbounded for a general metric space [1], and hence the number of inspected objects can be linear in the size of the database, which is the same as having no index at all.
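
The greedy walk described above can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function name `greedy_search`, the adjacency-dictionary representation of the graph, and the toy path graph over points on a line are all assumptions made for the example.

```python
import math
import random

def greedy_search(graph, points, query, start=None, dist=math.dist):
    """Greedy walk over a graph (hypothetical helper, not from the paper).

    graph  : dict mapping node id -> list of adjacent node ids
    points : sequence of coordinate tuples, indexed by node id
    Returns (node, distance to query, number of distance evaluations).
    """
    current = random.choice(list(graph)) if start is None else start
    current_d = dist(points[current], query)
    evaluated = 1
    while True:
        best, best_d = current, current_d
        # Review every neighbor of the current node.
        for nb in graph[current]:
            d = dist(points[nb], query)
            evaluated += 1
            if d < best_d:
                best, best_d = nb, d
        # Stop when no neighbor is closer to the query than the current node.
        if best == current:
            return current, current_d, evaluated
        current, current_d = best, best_d

# Toy data (assumed): ten points on a line, each linked to its immediate neighbors.
points = [(float(i),) for i in range(10)]
graph = {i: [j for j in (i - 1, i + 1) if 0 <= j < 10] for i in range(10)}

node, d, evaluated = greedy_search(graph, points, query=(6.2,), start=0)
print(node, round(d, 3), evaluated)
```

On this toy graph the walk reaches the true nearest neighbor because every node is linked to the nodes on both sides of it; the cost grows with the length of the path walked times the degree of the nodes visited.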

In this paper we introduce a quasi-proximity graph induced by the all-k-nearest neighbor graph. The degree of this graph is bounded, but the greedy algorithm described above can get stuck in local minima, which amounts to false positives in the query results.
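
To make the local-minimum problem concrete, the sketch below builds a plain directed k-nearest-neighbor graph by brute force and runs a greedy walk over it. It is only an illustration under stated assumptions: the abstract does not specify how the quasi-proximity graph is derived from the all-k-nearest neighbor graph, so the helper names `knn_graph` and `greedy_search`, the directed edges, and the two-cluster toy data are hypothetical.

```python
import math

def knn_graph(points, k, dist=math.dist):
    """Brute-force all-k-nearest-neighbor graph (assumed construction).

    Each node keeps directed edges to its k closest points, so the
    out-degree is bounded by k. Uses O(n^2) distance evaluations.
    """
    graph = {}
    for i, p in enumerate(points):
        ranked = sorted((dist(p, q), j) for j, q in enumerate(points) if j != i)
        graph[i] = [j for _, j in ranked[:k]]
    return graph

def greedy_search(graph, points, query, start, dist=math.dist):
    """Move to the neighbor closest to the query until none improves."""
    current, current_d = start, dist(points[start], query)
    while True:
        cand_d, cand = min((dist(points[j], query), j) for j in graph[current])
        if cand_d >= current_d:
            return current  # no neighbor is closer: stop here
        current, current_d = cand, cand_d

# Toy data (assumed): two well-separated clusters of four points each.
points = [(0, 0), (1, 0), (0, 1), (1, 1),
          (10, 10), (11, 10), (10, 11), (11, 11)]
graph = knn_graph(points, k=2)
query = (10.2, 10.1)  # true nearest neighbor is node 4, at (10, 10)

print(greedy_search(graph, points, query, start=0))  # stuck in the far cluster
print(greedy_search(graph, points, query, start=7))  # reaches node 4
```

With k = 2 no edge crosses between the two clusters, so a walk started in the wrong cluster stops at the node of that cluster closest to the query and returns a wrong answer. Restarting from several random nodes is one common way to reduce this effect; whether the paper does so is not stated in the abstract.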

We show experimental results for high-dimensional spaces. We report a recall greater than 90% for most configurations, which is very good for many proximity searching applications, while reviewing just a tiny portion of the database.
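
For reference, recall can be computed as the fraction of true nearest neighbors that the approximate search also returns. The abstract does not give the exact definition used in the experiments, so the function `recall` and the sample answers below are assumptions for illustration only.

```python
def recall(approx_results, exact_results):
    """Fraction of true neighbors that appear in the approximate answers.

    Both arguments are lists with one entry per query; each entry is the
    list of object ids returned for that query (hypothetical layout).
    """
    hits = sum(len(set(a) & set(e)) for a, e in zip(approx_results, exact_results))
    total = sum(len(e) for e in exact_results)
    return hits / total

# Made-up answers for three 1-nearest-neighbor queries: the second one is wrong.
approx = [[4], [3], [7]]
exact = [[4], [4], [7]]
print(recall(approx, exact))
```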

The space requirement for the index is linear in the database size, and the construction time is quadratic in the worst case. Relaxations of our method are sketched to obtain practical subquadratic implementations.

References

  1. Navarro, G.: Searching in metric spaces by spatial approximation. The VLDB Journal 11(1), 28–46 (2002)
  2. Samet, H.: Foundations of Multidimensional and Metric Data Structures. Morgan Kaufmann Publishers, San Francisco (2006)
  3. Chávez, E., Navarro, G., Baeza-Yates, R., Marroquín, J.L.: Searching in metric spaces. ACM Comput. Surv. 33(3), 273–321 (2001)
  4. Hjaltason, G.R., Samet, H.: Index-driven similarity search in metric spaces (survey article). ACM Trans. Database Syst. 28(4), 517–580 (2003)
  5. Zezula, P., Amato, G., Dohnal, V., Batko, M.: Similarity search: The metric space approach. Springer, New York (2006)
  6. Patella, M., Ciaccia, P.: Approximate similarity search: A multi-faceted problem. Journal of Discrete Algorithms 7(1), 36–48 (2009)
  7. Aurenhammer, F.: Voronoi diagrams—a survey of a fundamental geometric data structure. ACM Computing Surveys (CSUR) 23(3), 405 (1991)
  8. Paredes, R., Chávez, E.: Using the k-nearest neighbor graph for proximity searching in metric spaces. In: Consens, M.P., Navarro, G. (eds.) SPIRE 2005. LNCS, vol. 3772, pp. 127–138. Springer, Heidelberg (2005)
  9. Brito, M., Chavez, E., Quiroz, A., Yukich, J.: Connectivity of the mutual k-nearest-neighbor graph in clustering and outlier detection. Statistics & Probability Letters 35(1), 33–42 (1997)
  10. Vidal, E.: New formulation and improvements of the Nearest-Neighbour approximating and eliminating search algorithm (AESA). Pattern Recognition Letters 15(1), 1–7 (1994)
  11. Sebastian, T., Kimia, B.: Metric-based shape retrieval in large databases. In: Proceedings of 16th International Conference on Pattern Recognition, vol. 3 (2002)
  12. Burkhard, W.A., Keller, R.M.: Some approaches to best-match file searching. Communications of the ACM 16(4), 230–237 (1973)
  13. Baeza-Yates, R.A., Ribeiro-Neto, B.A.: Modern Information Retrieval. ACM Press/Addison-Wesley (1999)

Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Edgar Chávez (1, 2)
  • Eric Sadit Tellez (1)

  1. Universidad Michoacana, México
  2. CICESE, México
