Abstract
The k-nearest neighbours (k-NN) search is one of the most critical non-parametric methods used in data retrieval and similarity tasks. In recent years, demand has grown for fast k-NN processing of large amounts of high-dimensional data. Locality-sensitive hashing is a viable solution for computing fast approximate nearest neighbours (ANN) with reasonable accuracy. This chapter presents a novel parallelisation of the locality-sensitive hashing method using GPGPU, where the multi-probe variant is considered. The method was implemented on the CUDA platform for constructing a k-ANN graph. It was compared with the state-of-the-art CPU-based k-ANN method and two GPU-based k-NN methods on large, multidimensional data. The experimental results showed that the proposed method achieves a speed-up of 30× or more over the CPU-based approximate method whilst retaining a high recall rate.
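To make the idea of multi-probe locality-sensitive hashing concrete, the following is a minimal, sequential CPU sketch in Python. It is not the chapter's GPU implementation. It assumes a single hash table, Gaussian (p-stable) projections, and a simplified probing rule that visits every bucket differing from the query's bucket by ±1 in one hash coordinate. All function names and parameters (make_hash, hash_key, probe_keys, build_table, approx_knn, width) are hypothetical and chosen for illustration only.

```python
import math
import random


def make_hash(dim, width, rng):
    """Draw one p-stable hash h(v) = floor((a.v + b) / width)."""
    a = [rng.gauss(0.0, 1.0) for _ in range(dim)]  # Gaussian projection vector
    b = rng.uniform(0.0, width)                    # random offset in [0, width)
    return a, b


def hash_key(point, funcs, width):
    """Concatenate the values of several hash functions into one bucket key."""
    return tuple(
        math.floor((sum(ai * xi for ai, xi in zip(a, point)) + b) / width)
        for a, b in funcs
    )


def probe_keys(key):
    """Return the home bucket followed by buckets one step away in one coordinate.

    Simplifying assumption: all single-coordinate +/-1 perturbations are
    probed, without ranking them by how likely they are to hold neighbours.
    """
    keys = [key]
    for i in range(len(key)):
        for delta in (-1, 1):
            keys.append(key[:i] + (key[i] + delta,) + key[i + 1:])
    return keys


def build_table(points, funcs, width):
    """Map each bucket key to the indices of the points that hash into it."""
    table = {}
    for idx, p in enumerate(points):
        table.setdefault(hash_key(p, funcs, width), []).append(idx)
    return table


def approx_knn(query, points, table, funcs, width, k):
    """Collect candidates from the probed buckets, then rank them exactly."""
    candidates = set()
    for key in probe_keys(hash_key(query, funcs, width)):
        candidates.update(table.get(key, ()))
    ranked = sorted(candidates, key=lambda i: math.dist(query, points[i]))
    return ranked[:k]
```

Calling approx_knn once per data point would yield an approximate k-NN graph. Because only the probed buckets are searched, the result may contain fewer than k neighbours or miss true neighbours, which is the accuracy cost that recall measures.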
Acknowledgements
Thanks to the TEXMEX Research Team for publicly providing the SIFT data set for research purposes. This work was supported by the Slovenian Research Agency under research contracts 1000-13-0552, P2-0041, and J2-5479.
Copyright information
© 2015 Springer Science+Business Media Singapore
About this chapter
Cite this chapter
Lukač, N., Žalik, B. (2015). Fast Approximate k-Nearest Neighbours Search Using GPGPU. In: Cai, Y., See, S. (eds) GPU Computing and Applications. Springer, Singapore. https://doi.org/10.1007/978-981-287-134-3_14
DOI: https://doi.org/10.1007/978-981-287-134-3_14
Publisher Name: Springer, Singapore
Print ISBN: 978-981-287-133-6
Online ISBN: 978-981-287-134-3
eBook Packages: Engineering, Engineering (R0)