Efficient binary code indexing with pivot based locality sensitive clustering

Multimedia Tools and Applications

Abstract

High-dimensional indexing is fundamental to multimedia research. Compact binary code indexing has achieved significant success in recent years because it approximates high-dimensional data effectively. However, most existing binary code methods use a linear scan to find near neighbors, which involves unnecessary computation and degrades search efficiency, especially in large-scale applications. To avoid scanning codes that are, with high probability, not near neighbors, we propose a framework that indexes binary codes in clusters and scans only the codes in relevant clusters. Within this framework, we propose Pivot Based Locality Sensitive Clustering (PLSC) and a Density Adaptive Binary coding (DAB) method for PLSC clusters. PLSC uses pivots to estimate similarities between data points and generates clusters based on the Locality Sensitive Hashing scheme. DAB adopts different binary code generation methods according to cluster densities. Experiments on open datasets show that offline indexing based on PLSC is efficient and that DAB codes in PLSC clusters achieve a significant improvement in search efficiency over state-of-the-art binary codes.
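The general idea of scanning only a relevant cluster instead of every code can be illustrated with a toy sketch. This is not the PLSC or DAB algorithm from the paper: the pivot signature below (one bit per pivot, set when a code lies within a Hamming threshold of that pivot) is an assumption made for illustration, and the names `cluster_key`, `build_index`, `search`, and the `threshold` parameter are hypothetical.

```python
import random
from collections import defaultdict


def hamming(a: int, b: int) -> int:
    """Hamming distance between two binary codes stored as integers."""
    return bin(a ^ b).count("1")


def cluster_key(code, pivots, threshold):
    # Assumed toy signature: one bit per pivot, set when the code is
    # within `threshold` bits of that pivot. Not the paper's PLSC rule.
    return tuple(int(hamming(code, p) <= threshold) for p in pivots)


def build_index(codes, pivots, threshold):
    # Offline step: group code positions by their pivot signature.
    index = defaultdict(list)
    for i, code in enumerate(codes):
        index[cluster_key(code, pivots, threshold)].append(i)
    return index


def search(query, codes, index, pivots, threshold, k):
    # Online step: rank only the codes that share the query's cluster.
    candidates = index.get(cluster_key(query, pivots, threshold), [])
    return sorted(candidates, key=lambda i: hamming(query, codes[i]))[:k]


if __name__ == "__main__":
    rng = random.Random(0)
    codes = [rng.getrandbits(32) for _ in range(500)]
    pivots = codes[:4]  # assumption: pivots drawn from the data itself
    index = build_index(codes, pivots, threshold=14)
    hits = search(codes[10], codes, index, pivots, threshold=14, k=5)
    print(len(index), "clusters; top hits:", hits)
```

Because this sketch probes a single cluster, a true near neighbor that falls just across a signature boundary is missed; it trades recall for fewer distance computations, which is the trade-off any cluster-restricted scan has to manage.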

Acknowledgments

This work was supported by the National Natural Science Foundation of China (61271428, 61273247), the National Key Technology Research and Development Program of China (2012BAH39B02), the National Basic Research Program of China (973 Program, 2013CB329502), and the Co‐building Program of Beijing Municipal Education Commission.

Author information

Corresponding author

Correspondence to Yongdong Zhang.

About this article

Cite this article

Zhang, W., Gao, K., Zhang, Y. et al. Efficient binary code indexing with pivot based locality sensitive clustering. Multimed Tools Appl 69, 491–512 (2014). https://doi.org/10.1007/s11042-012-1354-z

  • DOI: https://doi.org/10.1007/s11042-012-1354-z
