Abstract
Automatic image annotation aims at labeling images with keywords. In this paper we investigate three annotation benchmark tasks that have been used in the literature to evaluate the performance of annotation systems. We empirically compare the first two tasks, the 5000 Corel images task and the Corel categories task, by applying a family of annotation system configurations derived from our PicSOM image content analysis framework. By studying the performance of these configurations together with figures reported in the literature, we establish an empirical correspondence between the performance levels in the two tasks. We also consider the ImageCLEF 2006 Object Annotation Task, which has previously been found difficult. By experimenting with the data, we gain insight into the reasons that make the ImageCLEF task difficult. In the course of our experiments, we demonstrate that in all three tasks the PicSOM system, which is based on fusing numerous global image features, outperforms the other annotation methods considered.
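Annotation benchmarks such as the 5000 Corel images task are commonly scored with per-keyword precision and recall over a test set (the abstract does not state the paper's exact metrics, so this is an illustrative assumption). A minimal sketch of that evaluation, with hypothetical `truth` and `pred` mappings from image identifiers to keyword sets:

```python
from collections import defaultdict

def per_keyword_scores(truth, pred):
    """Per-keyword precision and recall over a set of images.

    truth, pred: dict mapping image_id -> set of keywords
    (ground-truth and predicted annotations, respectively).
    Returns dict mapping keyword -> (precision, recall).
    """
    tp = defaultdict(int)  # keyword predicted and correct
    fp = defaultdict(int)  # keyword predicted but wrong
    fn = defaultdict(int)  # keyword missed by the annotator
    for img, gold in truth.items():
        guess = pred.get(img, set())
        for w in guess & gold:
            tp[w] += 1
        for w in guess - gold:
            fp[w] += 1
        for w in gold - guess:
            fn[w] += 1
    scores = {}
    for w in set(tp) | set(fp) | set(fn):
        p = tp[w] / (tp[w] + fp[w]) if tp[w] + fp[w] else 0.0
        r = tp[w] / (tp[w] + fn[w]) if tp[w] + fn[w] else 0.0
        scores[w] = (p, r)
    return scores
```

Summary statistics reported on such benchmarks (e.g. mean precision/recall, or the number of keywords with non-zero recall) follow directly from this per-keyword table.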
Supported by the Academy of Finland in the projects "Neural methods in information retrieval based on automatic content analysis and relevance feedback" and "Finnish Centre of Excellence in Adaptive Informatics Research". Special thanks to Kobus Barnard, Xiaojun Qi and Yutao Han for helping with the experimental setup.
© 2007 Springer-Verlag Berlin Heidelberg
Cite this paper
Viitaniemi, V., Laaksonen, J. (2007). Empirical Investigations on Benchmark Tasks for Automatic Image Annotation. In: Qiu, G., Leung, C., Xue, X., Laurini, R. (eds) Advances in Visual Information Systems. VISUAL 2007. Lecture Notes in Computer Science, vol 4781. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-76414-4_10
DOI: https://doi.org/10.1007/978-3-540-76414-4_10
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-76413-7
Online ISBN: 978-3-540-76414-4