Multimedia Tools and Applications, Volume 76, Issue 2, pp 2441–2466

Accelerated Manhattan hashing via bit-remapping with location information

  • Wenshuo Chen
  • Guiguang Ding
  • Zijia Lin
  • Jisheng Pei
Article

Abstract

Hashing encodes data points as binary codes that preserve the neighborhood structure of the original feature space, enabling efficient approximate nearest neighbor search in large-scale databases. Existing hashing methods usually adopt a two-stage strategy, a projection stage followed by a quantization stage, and use threshold-based single-bit quantization (SBQ) to binarize each projected dimension into 0 or 1. Similarity between hash codes is then measured by their Hamming distance. However, SBQ may destroy the original neighborhood structure by quantizing neighboring points that lie near the threshold into different binary values. Double-bit quantization (DBQ) and its derivative, Manhattan hashing, have been proposed to fix this problem. Experimental results showed that Manhattan hashing outperformed state-of-the-art methods in effectiveness but lost the efficiency advantage, because it measures similarity between hash codes with decimal arithmetic instead of fast bitwise operations. In this paper, we propose a strategy that accelerates Manhattan hashing by making full use of bitwise operations. Our main contributions are: 1) a new encoding method that assigns location information to each binary digit, avoiding the time-consuming decimal arithmetic; and 2) a novel hash code distance measurement that accelerates the calculation of Manhattan distance and thereby improves query efficiency. Extensive experiments on three benchmark datasets show that our approach speeds up data querying on 2-bit, 3-bit and 4-bit quantized hash codes by at least one order of magnitude on average, without any loss of precision.
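
The trade-off described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's bit-remapping scheme, which the abstract does not specify; it only contrasts SBQ with Hamming distance against multi-bit codes compared by Manhattan distance. The function names (`sbq`, `hamming`, `quantize_multibit`, `pack`, `manhattan`), the thresholds, and the sample points are all assumptions chosen for illustration.

```python
def sbq(values, threshold=0.0):
    """Single-bit quantization: one bit per projected dimension."""
    code = 0
    for v in values:
        code = (code << 1) | (1 if v > threshold else 0)
    return code


def hamming(a, b):
    """Hamming distance via XOR followed by a population count."""
    return bin(a ^ b).count("1")


def quantize_multibit(values, thresholds):
    """Map each value to the index of the region it falls into.

    With three sorted thresholds there are four regions (0..3),
    so each dimension needs two bits.
    """
    return [sum(v > t for t in thresholds) for v in values]


def pack(indices, q):
    """Pack region indices into one integer, q bits per dimension."""
    code = 0
    for idx in indices:
        code = (code << q) | idx
    return code


def manhattan(a, b, q, dims):
    """Manhattan distance between packed codes.

    Each q-bit field is decoded to an integer and the absolute
    differences are summed: the per-field arithmetic that is slower
    than a single XOR plus popcount.
    """
    mask = (1 << q) - 1
    total = 0
    for _ in range(dims):
        total += abs((a & mask) - (b & mask))
        a >>= q
        b >>= q
    return total


# Hypothetical projected points: x and y are close, z is far from x.
x, y, z = [-0.05, 0.4], [0.05, 0.6], [2.0, 0.6]

# SBQ separates the near neighbors x and y, yet merges y and z.
print(hamming(sbq(x), sbq(y)))  # 1
print(hamming(sbq(y), sbq(z)))  # 0

# Two bits per dimension, assumed thresholds at -1, 0 and 1.
thresholds = [-1.0, 0.0, 1.0]
cx, cy, cz = (pack(quantize_multibit(p, thresholds), 2) for p in (x, y, z))

# Manhattan distance keeps the ordering: x is nearer to y than to z.
print(manhattan(cx, cy, 2, 2))  # 1
print(manhattan(cx, cz, 2, 2))  # 2

# Hamming distance on the same natural-binary codes reverses it.
print(hamming(cx, cy))  # 2
print(hamming(cx, cz))  # 1
```

In this sketch the Manhattan comparison loops over every field, whereas the Hamming comparison is one XOR and one bit count over the whole code. That gap is the efficiency cost the abstract attributes to Manhattan hashing.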

Keywords

Accelerated Manhattan hashing, Bit-remapping, Multiple-bit quantization, Manhattan distance

Notes

Acknowledgments

This research was supported by the National Natural Science Foundation of China (Grant Nos. 61271394 and 61571269). The authors would like to thank the anonymous reviewers for their valuable comments.

Copyright information

© Springer Science+Business Media New York 2016

Authors and Affiliations

  • Wenshuo Chen (1)
  • Guiguang Ding (1)
  • Zijia Lin (2)
  • Jisheng Pei (2)
  1. School of Software, Tsinghua University, Beijing, China
  2. Department of Computer Science and Technology, Tsinghua University, Beijing, China
