An Efficient Hashing Algorithm for NN Problem in HD Spaces

  • Faraj Alhwarin
  • Alexander Ferrein
  • Ingrid Scholl
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11351)

Abstract

Nearest neighbor (NN) search is a fundamental problem in many computer applications, such as multimedia search, computer vision and machine learning. While this problem is trivial in low-dimensional search spaces, it becomes much more difficult in higher dimensions because of the phenomenon known as the curse of dimensionality, where the complexity grows exponentially with dimension and the data tends to show strong correlations between dimensions. In this paper, we introduce a new hashing method to cope efficiently with this challenge. The idea is to split the search space into many subspaces based on a number of jointly independent and uniformly distributed circular random variables (CRVs) computed from the data points. Our method has been tested on datasets of local SIFT and global GIST features and compared with locality-sensitive hashing (LSH), Spherical Hashing methods (HD and SHD) and the fast library for approximate nearest neighbor (FLANN) matcher, using linear search as a baseline. The experimental results show that our method outperforms all state-of-the-art methods for the GIST features. For SIFT features, the results indicate that our method significantly reduces the search query time while preserving the search quality, and that it outperforms FLANN for datasets of fewer than 200 K points.
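
For orientation, the linear-search baseline mentioned in the abstract can be sketched as an exhaustive scan over all data points. This is only a minimal illustration of a generic brute-force baseline, not the authors' CRV-based method or their evaluation code; the function names `linear_nn` and `recall_at_1`, the use of Euclidean distance, and the recall measure are assumptions made for this example.

```python
import math
from typing import List, Sequence, Tuple


def linear_nn(query: Sequence[float],
              points: List[Sequence[float]]) -> Tuple[int, float]:
    """Exhaustive NN search: return (index, Euclidean distance) of the closest point.

    Hypothetical helper for illustration; cost is O(n * d) per query.
    """
    if not points:
        raise ValueError("points must not be empty")
    best_idx, best_sq = -1, math.inf
    for i, p in enumerate(points):
        # Compare squared distances to avoid a square root per point.
        sq = sum((a - b) ** 2 for a, b in zip(query, p))
        if sq < best_sq:
            best_idx, best_sq = i, sq
    return best_idx, math.sqrt(best_sq)


def recall_at_1(approx: List[int], exact: List[int]) -> float:
    """Fraction of queries for which an approximate method returned the exact NN.

    Assumed quality measure for this sketch; the paper may use a different one.
    """
    if len(approx) != len(exact):
        raise ValueError("result lists must have the same length")
    if not exact:
        return 0.0
    hits = sum(1 for a, e in zip(approx, exact) if a == e)
    return hits / len(exact)
```

An approximate matcher would be judged against such a baseline by comparing the indices it returns with those from `linear_nn` over a set of queries, alongside the query time of each.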

Keywords

Feature matching, Hash trees, NN search, Curse of dimensionality

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Faraj Alhwarin (1)
  • Alexander Ferrein (1)
  • Ingrid Scholl (1)
  1. Mobile Autonomous Systems & Cognitive Robotics Institute, FH Aachen University of Applied Sciences, Aachen, Germany