
Exclusive Visual Descriptor Quantization

  • Conference paper
Computer Vision – ACCV 2012 (ACCV 2012)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 7724)

Abstract

Vector quantization (VQ) using exhaustive nearest neighbor (NN) search is the speed bottleneck in classic bag of visual words (BOV) models. Approximate NN (ANN) search methods are still slow in VQ, because they check multiple regions of the search space to reduce VQ errors. In this paper, we propose ExVQ, an exclusive NN search method that speeds up BOV models. Given a visual descriptor, a linear projection excludes a portion of the search regions from the whole search space. We keep the VQ errors introduced by each exclusion minimal by learning an accurate classifier. ExVQ organizes multiple exclusions in a tree structure, and its VQ speed and VQ error rate can be reliably estimated. We show that ExVQ is much faster than state-of-the-art ANN methods in BOV models while maintaining almost the same classification accuracy. In addition, we show empirically that even with a VQ error rate as high as 30%, the classification accuracy of some ANN methods, including ExVQ, is similar to that of exhaustive search (which has zero VQ error). In some cases, ExVQ achieves even higher classification accuracy than exhaustive search.
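
The idea of organizing exclusions in a tree can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract states that each exclusion is a linear projection learned as a classifier, whereas the sketch below substitutes a simple axis-aligned median split as a stand-in. All names here (`ExclusionNode`, `build_tree`, `quantize`, `vq_error_rate`, `leaf_size`) are hypothetical and chosen only for illustration.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def exhaustive_nn(x, codebook, candidates=None):
    """Index of the nearest codeword among `candidates` (all codewords by default)."""
    if candidates is None:
        candidates = range(len(codebook))
    return min(candidates, key=lambda i: sq_dist(x, codebook[i]))


class ExclusionNode:
    """One exclusion: the sign of w.x + b decides which candidates are discarded."""

    def __init__(self, w, b, left, right):
        self.w, self.b = w, b                  # hyperplane parameters (assumed form)
        self.left, self.right = left, right    # child node or list of candidate indices


def build_tree(codebook, candidates, leaf_size):
    """Recursively split the candidate codewords; requires leaf_size >= 1."""
    if len(candidates) <= leaf_size:
        return list(candidates)
    dim = len(codebook[0])

    def spread(d):
        values = [codebook[i][d] for i in candidates]
        return max(values) - min(values)

    # ASSUMPTION: stand-in split on the widest axis at the median.
    # The paper learns the projection with a classifier instead.
    axis = max(range(dim), key=spread)
    ordered = sorted(candidates, key=lambda i: codebook[i][axis])
    mid = len(ordered) // 2
    threshold = (codebook[ordered[mid - 1]][axis] + codebook[ordered[mid]][axis]) / 2
    w = [0.0] * dim
    w[axis] = 1.0
    return ExclusionNode(
        w,
        -threshold,
        build_tree(codebook, ordered[:mid], leaf_size),
        build_tree(codebook, ordered[mid:], leaf_size),
    )


def quantize(x, codebook, node):
    """Follow one path down the tree (no backtracking), then search the leaf exhaustively."""
    while isinstance(node, ExclusionNode):
        score = sum(wi * xi for wi, xi in zip(node.w, x)) + node.b
        node = node.left if score < 0 else node.right
    return exhaustive_nn(x, codebook, node)


def vq_error_rate(descriptors, codebook, tree):
    """Fraction of descriptors assigned a different codeword than exhaustive search."""
    wrong = sum(
        quantize(x, codebook, tree) != exhaustive_nn(x, codebook) for x in descriptors
    )
    return wrong / len(descriptors)
```

For example, with the one-dimensional codebook `[[0.0], [1.0], [10.0], [11.0]]` and `leaf_size=2`, the root splits at 5.5, so a query at 0.9 only compares against the first two codewords. Because each query follows a single path, the cost is one projection per level plus a small exhaustive search at the leaf; the price is that the true nearest codeword may already have been excluded, which is the VQ error that `vq_error_rate` measures.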

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Zhang, Y., Wu, J., Lin, W. (2013). Exclusive Visual Descriptor Quantization. In: Lee, K.M., Matsushita, Y., Rehg, J.M., Hu, Z. (eds) Computer Vision – ACCV 2012. ACCV 2012. Lecture Notes in Computer Science, vol 7724. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-37331-2_31

  • DOI: https://doi.org/10.1007/978-3-642-37331-2_31

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-37330-5

  • Online ISBN: 978-3-642-37331-2

  • eBook Packages: Computer Science, Computer Science (R0)
