Fast Approximate k-Nearest Neighbours Search Using GPGPU

GPU Computing and Applications

Abstract

The k-nearest neighbours (k-NN) search is one of the most important non-parametric methods used in data retrieval and similarity tasks. In recent years, demand has grown for fast k-NN processing of large amounts of high-dimensional data. Locality-sensitive hashing is a viable solution for computing fast approximate nearest neighbours (ANN) with reasonable accuracy. This chapter presents a novel parallelisation of the locality-sensitive hashing method using GPGPU, where the multi-probe variant is considered. The method was implemented on the CUDA platform for constructing a k-ANN graph. It was compared with a state-of-the-art CPU-based k-ANN method and two GPU-based k-NN methods on large, multidimensional data sets. The experimental results showed that the proposed method achieves a speed-up of 30× or higher over the CPU-based approximate method, whilst retaining a high recall rate.
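To make the idea of multi-probe locality-sensitive hashing concrete, the following is a minimal single-threaded Python sketch, not the chapter's CUDA implementation. It assumes the common p-stable hash form h(v) = floor((a·v + b) / w) with Gaussian a, and it simplifies multi-probing to visiting buckets that differ by ±1 in one hash coordinate rather than following a query-directed probing sequence. All function names and parameters (`make_hashes`, `bucket_key`, `probes`, `build_table`, `knn_approx`, `m`, `w`) are hypothetical and chosen for illustration only.

```python
import math
import random


def make_hashes(dim, m, w, rng):
    """Draw m random projections: a ~ N(0, 1)^dim, offset b ~ U[0, w)."""
    return [([rng.gauss(0.0, 1.0) for _ in range(dim)], rng.uniform(0.0, w))
            for _ in range(m)]


def bucket_key(v, hashes, w):
    """Concatenate m quantised projections into one bucket key."""
    return tuple(math.floor((sum(ai * vi for ai, vi in zip(a, v)) + b) / w)
                 for a, b in hashes)


def probes(key):
    """Yield the home bucket, then buckets perturbed by +/-1 in one coordinate.

    Assumption: a simplified stand-in for a multi-probe sequence.
    """
    yield key
    for i in range(len(key)):
        for delta in (-1, 1):
            yield key[:i] + (key[i] + delta,) + key[i + 1:]


def build_table(points, hashes, w):
    """Map each bucket key to the indices of the points that fall into it."""
    table = {}
    for idx, p in enumerate(points):
        table.setdefault(bucket_key(p, hashes, w), []).append(idx)
    return table


def knn_approx(q, k, points, table, hashes, w):
    """Collect candidates from the probed buckets, then rank them exactly."""
    candidates = set()
    for key in probes(bucket_key(q, hashes, w)):
        candidates.update(table.get(key, ()))
    return sorted(candidates, key=lambda i: math.dist(q, points[i]))[:k]
```

In this sketch, only the points sharing a probed bucket with the query are ranked by exact Euclidean distance, which is where the saving over a brute-force scan comes from. Fewer than k neighbours may be returned when the probed buckets are sparse; a practical system would typically use several hash tables to raise recall.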

Acknowledgements

Thanks to the TEXMEX Research Team for making the SIFT data set publicly available for research purposes. This work was supported by the Slovenian Research Agency under research contracts 1000-13-0552, P2-0041, and J2-5479.

Copyright information

© 2015 Springer Science+Business Media Singapore

About this chapter

Cite this chapter

Lukač, N., Žalik, B. (2015). Fast Approximate k-Nearest Neighbours Search Using GPGPU. In: Cai, Y., See, S. (eds) GPU Computing and Applications. Springer, Singapore. https://doi.org/10.1007/978-981-287-134-3_14

  • DOI: https://doi.org/10.1007/978-981-287-134-3_14

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-287-133-6

  • Online ISBN: 978-981-287-134-3

  • eBook Packages: Engineering, Engineering (R0)
