Scene Categorization by Introducing Contextual Information to the Visual Words

  • Conference paper
Advances in Visual Computing (ISVC 2009)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 5875)

Abstract

In this paper, we propose a novel scene categorization method based on contextual visual words. The method extends the traditional ‘bag of visual words’ model by adding contextual information from the coarser scale level and from neighboring regions to the local region of interest. The proposed method is evaluated on two scene classification datasets totaling 6,447 images, with 8 and 13 scene categories respectively, using 10-fold cross-validation. The experimental results show that it achieves recognition rates of 90.30% and 87.63% on Datasets 1 and 2 respectively, significantly outperforming previous visual-word methods that represent local information in a statistical manner. It also outperforms the spatial pyramid matching based method, which is among the best-performing scene categorization methods reported for these two datasets in the literature.
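
The abstract does not give implementation details, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes each image is a grid of local patch descriptors, that the coarser scale is a second grid at half the resolution, and that context is added by simple concatenation before quantization against a given codebook. All function names, the half-resolution mapping, the 4-neighborhood, and the averaging of neighbors are hypothetical choices made for illustration.

```python
from math import dist  # Euclidean distance between two points (Python 3.8+)


def mean_vector(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(component) / n for component in zip(*vectors)]


def contextual_descriptor(fine, coarse, r, c):
    """Concatenate local, coarser-scale, and neighbor information for patch (r, c).

    fine   -- grid (list of rows) of local descriptors, at least 2 patches
    coarse -- assumed grid at half the resolution; (r, c) maps to (r // 2, c // 2)
    """
    rows, cols = len(fine), len(fine[0])
    local = fine[r][c]
    parent = coarse[r // 2][c // 2]
    # Assumed context: the 4-connected neighbors that fall inside the grid.
    neighbors = [
        fine[rr][cc]
        for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
        if 0 <= rr < rows and 0 <= cc < cols
    ]
    return local + parent + mean_vector(neighbors)


def nearest_word(descriptor, codebook):
    """Index of the codeword closest to the descriptor."""
    return min(range(len(codebook)), key=lambda k: dist(descriptor, codebook[k]))


def bag_of_contextual_words(fine, coarse, codebook):
    """Normalized histogram of contextual visual words for one image."""
    counts = [0] * len(codebook)
    for r in range(len(fine)):
        for c in range(len(fine[0])):
            word = nearest_word(contextual_descriptor(fine, coarse, r, c), codebook)
            counts[word] += 1
    total = sum(counts)
    return [count / total for count in counts]
```

In a full pipeline the codebook would typically be learned by clustering contextual descriptors from training images, and the resulting histograms would be passed to a classifier evaluated with 10-fold cross-validation as the abstract describes; both steps are omitted here.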

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Qin, J., Yung, N.H.C. (2009). Scene Categorization by Introducing Contextual Information to the Visual Words. In: Bebis, G., et al. Advances in Visual Computing. ISVC 2009. Lecture Notes in Computer Science, vol 5875. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-10331-5_28

  • DOI: https://doi.org/10.1007/978-3-642-10331-5_28

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-10330-8

  • Online ISBN: 978-3-642-10331-5

  • eBook Packages: Computer Science, Computer Science (R0)
