Scene Categorization by Introducing Contextual Information to the Visual Words

Qin, Jianzhao; Yung, Nelson H. C.

doi:10.1007/978-3-642-10331-5_28

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 5875)

Included in the following conference series:

International Symposium on Visual Computing

Abstract

In this paper, we propose a novel scene categorization method based on contextual visual words. The method extends the traditional ‘bag of visual words’ model by introducing contextual information from the coarser scale level and from neighboring regions into the local region of interest. It is evaluated on two scene classification datasets totaling 6,447 images, with 8 and 13 scene categories respectively, using 10-fold cross-validation. The experimental results show that the proposed method achieves recognition rates of 90.30% and 87.63% on Dataset 1 and Dataset 2 respectively, significantly outperforming previous methods based on visual words that represent the local information in a statistical manner. The proposed method also outperforms the spatial pyramid matching based scene categorization method, which is among the best-performing methods reported on these two datasets in the literature.
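
The abstract does not say how the contextual information is combined with the local descriptor, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes that each local region comes with one descriptor from the coarser scale level and a fixed number of neighbor-region descriptors, that context is added by simple concatenation, and that the codebook is a set of sampled descriptors rather than a learned one. All function names, array shapes, and the random data are hypothetical.

```python
import numpy as np

def contextual_descriptors(local, coarse, neighbors):
    """Concatenate each local descriptor with its coarser-scale and
    neighbor-region descriptors (an assumed way of adding context)."""
    # local: (n, d), coarse: (n, d), neighbors: (n, k, d)
    n = local.shape[0]
    return np.hstack([local, coarse, neighbors.reshape(n, -1)])

def quantize(descriptors, codebook):
    """Return, for each descriptor, the index of its nearest codeword."""
    # Squared Euclidean distance between every descriptor and every codeword.
    diff = descriptors[:, None, :] - codebook[None, :, :]
    return (diff ** 2).sum(axis=2).argmin(axis=1)

def bag_of_words(descriptors, codebook):
    """L1-normalized histogram of codeword occurrences for one image."""
    words = quantize(descriptors, codebook)
    hist = np.bincount(words, minlength=codebook.shape[0]).astype(float)
    return hist / hist.sum()

# Hypothetical data: 50 regions, 8-dimensional descriptors, 4 neighbors each.
rng = np.random.default_rng(0)
n, d, k, vocab_size = 50, 8, 4, 10
local = rng.normal(size=(n, d))
coarse = rng.normal(size=(n, d))
neighbors = rng.normal(size=(n, k, d))

ctx = contextual_descriptors(local, coarse, neighbors)

# Stand-in codebook: sampled descriptors (a real system would cluster
# descriptors from the training images instead).
codebook = ctx[rng.choice(n, size=vocab_size, replace=False)]
hist = bag_of_words(ctx, codebook)
```

Under these assumptions, each image is summarized by one histogram over contextual visual words, which can then be passed to any standard classifier and evaluated with 10-fold cross-validation as in the abstract.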

Author information

Authors and Affiliations

Laboratory for Intelligent Transportation Systems Research, Department of Electrical & Electronic Engineering, The University of Hong Kong, Pokfulam Road, Hong Kong SAR, China
Jianzhao Qin & Nelson H. C. Yung

Editor information

Editors and Affiliations

Department of Computer Science and Engineering, University of Nevada, Reno, USA
George Bebis
NASA Ames Research Center, Moffett Field, CA, USA
Richard Boyle
Lawrence Berkeley National Laboratory, Berkeley, CA, USA
Bahram Parvin
Desert Research Institute, Reno, NV, USA
Darko Koracin
Graduate School of Science and Engineering, Saitama University, 255 Shimo-Okubo, Sakura-ku, Saitama-shi, 338-8570, Saitama, Japan
Yoshinori Kuno
Institute for Infocomm Research, 21 Heng Mui Keng Terrace, P.O. Box, 119613, Singapore
Junxian Wang
Department of Computer Science & Information Engineering, Tamkang University, Tamsui, Taipei, Taiwan, R.O.C.
Jun-Xuan Wang
Microsoft Research, Redmond, WA, USA
Junxian Wang
Department of Informatics, Univ. of Zurich, Winterthurerstr. 190, P.O. Box, 8057, Zurich, Switzerland
Renato Pajarola
Lawrence Livermore National Laboratory, 94550, Livermore, CA, USA
Peter Lindstrom
University of Applied Sciences Bonn-Rhein-Sieg, 53754, Sankt Augustin, Germany
André Hinkenjann
Humana Inc., 40202, Louisville, KY, USA
Miguel L. Encarnação
SCI Institute & School of Computing, University of Utah, 84112, Salt Lake City, UT, USA
Cláudio T. Silva
Desert Research Institute, 89512, Reno, NV, USA
Daniel Coming

About this paper

Cite this paper

Qin, J., Yung, N.H.C. (2009). Scene Categorization by Introducing Contextual Information to the Visual Words. In: Bebis, G., et al. Advances in Visual Computing. ISVC 2009. Lecture Notes in Computer Science, vol 5875. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-10331-5_28

DOI: https://doi.org/10.1007/978-3-642-10331-5_28
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-10330-8
Online ISBN: 978-3-642-10331-5
eBook Packages: Computer Science, Computer Science (R0)
