An approximate nearest neighbours search algorithm based on the Extended General Spacefilling Curves Heuristic

  • Juan-Carlos Pérez
  • Enrique Vidal
Statistical Pattern Recognition
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1451)

Abstract

In this paper, an algorithm for approximate nearest neighbours search in vector spaces is proposed. It is based on the Extended General Spacefilling Curves Heuristic (EGSH). Under this general scheme, a number of mappings are established between a region of a multidimensional real vector space and an interval of the real line, and then for each mapping the problem is solved in one dimension. To this end, the real values that represent the prototypes are stored in several ordered data structures (e.g. B-trees), one per mapping. The nearest neighbours of a test point are then efficiently searched in each structure and placed into a set of candidate neighbours. Finally, the distance from each candidate to the test point is measured in the original multidimensional space, and the nearest one(s) are chosen.
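The scheme outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which curves or parameters are used, so the Z-order (Morton) curve, the random per-mapping shifts, the sorted lists standing in for B-trees, and the names `morton_key`, `CurveIndex`, `BITS`, `n_mappings` and `width` are all assumptions made here for illustration. Points are assumed to lie in the unit hypercube.

```python
import bisect
import math
import random

BITS = 10  # assumed grid resolution per coordinate


def morton_key(point, shift):
    """Map a point in [0, 1)^d to an integer position on a Z-order curve.

    The point is shifted (with wrap-around) before its coordinate bits are
    interleaved, so that each shift yields a different 1-D mapping.
    """
    top = (1 << BITS) - 1
    cells = [min(int(((x + s) % 1.0) * (1 << BITS)), top)
             for x, s in zip(point, shift)]
    key = 0
    for bit in range(BITS - 1, -1, -1):
        for c in cells:
            key = (key << 1) | ((c >> bit) & 1)
    return key


class CurveIndex:
    """Several ordered 1-D structures, one per mapping (hypothetical helper)."""

    def __init__(self, prototypes, n_mappings=4, seed=0):
        rng = random.Random(seed)
        dim = len(prototypes[0])
        self.prototypes = prototypes
        self.shifts = [[rng.random() for _ in range(dim)]
                       for _ in range(n_mappings)]
        # A sorted list of (key, prototype index) stands in for a B-tree.
        self.tables = [
            sorted((morton_key(p, s), i) for i, p in enumerate(prototypes))
            for s in self.shifts
        ]

    def query(self, test_point, width=3):
        """Return the index of an approximate nearest prototype."""
        candidates = set()
        for shift, table in zip(self.shifts, self.tables):
            key = morton_key(test_point, shift)
            pos = bisect.bisect_left(table, (key, -1))
            # Collect the 1-D neighbours on both sides of the test key.
            for _, i in table[max(0, pos - width):pos + width]:
                candidates.add(i)
        # Rank the candidates by distance in the original space.
        return min(candidates,
                   key=lambda i: math.dist(test_point, self.prototypes[i]))


if __name__ == "__main__":
    rng = random.Random(1)
    protos = [(rng.random(), rng.random()) for _ in range(1000)]
    index = CurveIndex(protos)
    print(index.query((0.5, 0.5)))
```

Each query costs one binary search per mapping plus a handful of distance computations, rather than one distance computation per prototype. The result is approximate: the true nearest neighbour is missed whenever it does not fall within `width` positions of the test point in any of the mappings.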

Keywords

Test Point; Uniform Random Distribution; Exhaustive Search Method; Neighbour Search Algorithm; Temporal Cost
These keywords were added by machine and not by the authors.

Copyright information

© Springer-Verlag Berlin Heidelberg 1998

Authors and Affiliations

  • Juan-Carlos Pérez (1)
  • Enrique Vidal (2)
  1. Dpt. de Informática de Sistemas y Computadores (DISCA), Universidad Politécnica de Valencia, C. de Vera s/n, Spain
  2. Dpt. de Sistemas Informáticos y Computación (DSIC), Universidad Politécnica de Valencia, C. de Vera s/n, Spain