Image Classification Using Spatial Pyramid Coding and Visual Word Reweighting

Zhang, Chunjie; Liu, Jing; Wang, Jinqiao; Tian, Qi; Xu, Changsheng; Lu, Hanqing; Ma, Songde

doi:10.1007/978-3-642-19318-7_19

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 6494)

Included in the following conference series:

Asian Conference on Computer Vision

Abstract

Two main obstacles in the bag-of-visual-words (BoW) method for image classification are that it ignores spatial information and the semantics of visual words. To address both, we present an improved BoW representation that uses spatial pyramid coding (SPC) and visual word reweighting. In the SPC procedure, we adopt the sparse coding technique to encode visual features under a spatial constraint: visual features from the same spatial sub-region of the images are collected to generate the visual vocabulary. Additionally, we propose a relaxed but simple solution for embedding semantics into visual words. We relax the semantic embedding from ideal semantic correspondence to the naive semantic purity of visual words, and reweight each visual word according to its semantic purity. Higher weights are given to semantically distinctive visual words, and lower weights to semantically general ones. Experiments on a public dataset demonstrate the effectiveness of the proposed method.
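The reweighting idea can be illustrated with a minimal sketch. The abstract does not give the chapter's purity measure, so the code below assumes one plausible stand-in: one minus the normalized entropy of a visual word's class distribution over labeled training features. The function names `purity_weights` and `reweight_histogram` and all of their parameters are hypothetical, chosen only for this illustration.

```python
import math


def purity_weights(word_ids, labels, num_words, num_classes):
    """Return one weight per visual word, in [0, 1].

    Assumption (not from the chapter): purity is taken as
    1 - normalized entropy of the word's class distribution.
    A word seen in a single class gets weight 1; a word spread
    evenly over all classes gets weight 0.
    """
    if num_classes < 2:
        raise ValueError("need at least two classes to measure purity")

    # counts[w][c] = number of training features assigned to word w in class c
    counts = [[0] * num_classes for _ in range(num_words)]
    for w, c in zip(word_ids, labels):
        counts[w][c] += 1

    weights = []
    for row in counts:
        total = sum(row)
        if total == 0:
            weights.append(0.0)  # unused word: no evidence, so no weight
            continue
        entropy = -sum((n / total) * math.log(n / total) for n in row if n > 0)
        weights.append(1.0 - entropy / math.log(num_classes))
    return weights


def reweight_histogram(hist, weights):
    """Scale each bin of a BoW histogram by its visual word's weight."""
    return [h * w for h, w in zip(hist, weights)]


if __name__ == "__main__":
    # Word 0 occurs only in class 0; word 1 is split evenly between classes.
    word_ids = [0, 0, 0, 1, 1, 1, 1]
    labels = [0, 0, 0, 0, 1, 0, 1]
    weights = purity_weights(word_ids, labels, num_words=2, num_classes=2)
    print(weights)
    print(reweight_histogram([3.0, 4.0], weights))
```

In this toy input the class-specific word keeps its full contribution while the evenly shared word is suppressed, which matches the stated intent of favoring semantically distinctive words over general ones. Any other purity measure could be substituted without changing `reweight_histogram`.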

Author information

Authors and Affiliations

National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, P.O. Box 2728, Beijing, China
Chunjie Zhang, Jing Liu, Jinqiao Wang, Changsheng Xu, Hanqing Lu & Songde Ma
University of Texas at San Antonio, One UTSA Circle, San Antonio, Texas 78249, USA
Qi Tian

Editor information

Editors and Affiliations

Technion – Israel Institute of Technology, Department of Computer Science, 32000, Haifa, Israel
Ron Kimmel
The University of Auckland, 37 Kohimarama Road, Mission Bay, 1071, Auckland, New Zealand
Reinhard Klette
National Institute of Informatics, Chiyoda, 1018430, Tokyo, Japan
Akihiro Sugimoto

Cite this paper

Zhang, C. et al. (2011). Image Classification Using Spatial Pyramid Coding and Visual Word Reweighting. In: Kimmel, R., Klette, R., Sugimoto, A. (eds) Computer Vision – ACCV 2010. ACCV 2010. Lecture Notes in Computer Science, vol 6494. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-19318-7_19

DOI: https://doi.org/10.1007/978-3-642-19318-7_19
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-19317-0
Online ISBN: 978-3-642-19318-7
eBook Packages: Computer Science, Computer Science (R0)
