Automatic Image Annotation for Description of Urban and Outdoor Scenes

  • Conference paper
Part of the book series: Lecture Notes in Electrical Engineering (LNEE, volume 313)

Abstract

In this paper we present a novel approach for the automatic annotation of objects or regions in images based on their color and texture. Following the proposed generalized architecture for automatic generation of image content descriptions, detected regions are labeled by a cascade SVM-based classifier and mapped to a structure that reflects their hierarchical and spatial relations, which is then used by a text generation engine. To test the designed system, around 2,000 outdoor and indoor scene images from the standard IAPR TC-12 dataset were processed, yielding an average classification precision of about 75 % with 94 % recall. Extending the classifier with a texture detector based on Gabor filters improved the precision of color-based classification by up to 15 ± 5 %. The proposed approach offers a good compromise between region classification precision and speed, despite a considerable processing time of up to 1 s per image. It may be used as a tool for efficient automatic image understanding and description.
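The texture cue described above can be illustrated with a minimal sketch: a single real Gabor kernel is convolved with a gray-level patch, and its mean squared response serves as a texture feature. This is only an illustration of the general technique, not the authors' pipeline; in the paper, a bank of such responses would feed the cascade SVM classifier alongside color features, and all parameter values below are illustrative.

```python
# Illustrative sketch (not the paper's implementation): a real Gabor
# kernel as a texture detector for a small gray-level patch.
import math

def gabor_kernel(size=7, wavelength=4.0, theta=0.0, sigma=2.0):
    """Real part of a Gabor kernel: Gaussian envelope times cosine carrier."""
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate coordinates by the filter orientation theta.
            xr = x * math.cos(theta) + y * math.sin(theta)
            yr = -x * math.sin(theta) + y * math.cos(theta)
            envelope = math.exp(-(xr * xr + yr * yr) / (2.0 * sigma ** 2))
            carrier = math.cos(2.0 * math.pi * xr / wavelength)
            row.append(envelope * carrier)
        kernel.append(row)
    return kernel

def gabor_energy(patch, kernel):
    """Mean squared filter response over all valid positions (a texture feature)."""
    k = len(kernel)
    h, w = len(patch), len(patch[0])
    total, count = 0.0, 0
    for i in range(h - k + 1):
        for j in range(w - k + 1):
            resp = sum(kernel[u][v] * patch[i + u][j + v]
                       for u in range(k) for v in range(k))
            total += resp * resp
            count += 1
    return total / count

# Synthetic 12x12 patches: vertical stripes (period 4) vs. a flat region.
stripes = [[255 if (x // 2) % 2 == 0 else 0 for x in range(12)] for _ in range(12)]
flat = [[128] * 12 for _ in range(12)]

k = gabor_kernel(theta=0.0, wavelength=4.0)  # tuned to the stripe period
print(gabor_energy(stripes, k) > gabor_energy(flat, k))  # prints True
```

The kernel's wavelength matches the stripe period, so the textured patch produces a much larger energy response than the flat one; in a real annotator, several orientations and wavelengths would be pooled into a feature vector per region.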



Acknowledgment

This research is sponsored by European Grant #247083, "Security, Services, Networking and Performance of Next Generation IP-based Multimedia Wireless Networks," and by the Mexican National Council of Science and Technology (CONACyT), project #154438.

Author information

Corresponding author

Correspondence to Claudia Cruz-Perez.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Cruz-Perez, C., Starostenko, O., Alarcon-Aquino, V., Rodriguez-Asomoza, J. (2015). Automatic Image Annotation for Description of Urban and Outdoor Scenes. In: Sobh, T., Elleithy, K. (eds) Innovations and Advances in Computing, Informatics, Systems Sciences, Networking and Engineering. Lecture Notes in Electrical Engineering, vol 313. Springer, Cham. https://doi.org/10.1007/978-3-319-06773-5_20

  • DOI: https://doi.org/10.1007/978-3-319-06773-5_20

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-06772-8

  • Online ISBN: 978-3-319-06773-5

  • eBook Packages: Engineering (R0)
