Improving Image Annotations Using WordNet

Jin, Yohan; Wang, Lei; Khan, Latifur

doi:10.1007/11551898_12

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 3665)

Included in the following conference series:

International Workshop on Multimedia Information Systems

Abstract

Advances in technology generate huge amounts of non-textual information, such as images, so efficient image annotation and retrieval systems are highly desirable. Clustering algorithms make it possible to represent the visual features of images with a finite set of symbols. Building on this, many statistical models have been published that analyze the correspondence between visual features and words and discover hidden semantics. These models improve the annotation and retrieval of large image databases. However, the current state of the art, including our previous work, produces too many irrelevant keywords for images during annotation. In this paper, we propose a novel approach that augments the classical model with a generic knowledge base, WordNet, in order to prune irrelevant keywords. To identify irrelevant keywords, we investigate various semantic similarity measures between keywords and fuse the outcomes of all these measures to make a final decision. We have implemented various models that link visual tokens with keywords using WordNet, and we evaluated their performance in terms of precision and recall on a benchmark dataset. The results show that augmenting the classical model with a knowledge base improves annotation accuracy by removing irrelevant keywords.
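
The pruning idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the tiny hand-made taxonomy `PARENT` stands in for WordNet, the two measures (a path-based similarity and a Wu-Palmer-style similarity) are swapped-in examples of taxonomy-based similarity, and the thresholds and the majority-vote fusion rule in `prune_keywords` are assumptions chosen only for illustration.

```python
# Toy is-a taxonomy standing in for WordNet (hypothetical data).
PARENT = {
    "entity": None,
    "animal": "entity", "plant": "entity", "artifact": "entity",
    "feline": "animal", "tiger": "feline", "cat": "feline",
    "tree": "plant", "grass": "plant",
    "building": "artifact",
}

def ancestors(word):
    """Return the chain from word up to the root, word included."""
    chain = [word]
    while PARENT[chain[-1]] is not None:
        chain.append(PARENT[chain[-1]])
    return chain

def depth(word):
    """Number of nodes on the path to the root (the root has depth 1)."""
    return len(ancestors(word))

def lowest_common_subsumer(a, b):
    """Deepest node that is an ancestor of both a and b."""
    seen = set(ancestors(a))
    for node in ancestors(b):  # walks upward from b, so the first hit is deepest
        if node in seen:
            return node
    raise ValueError("words do not share a root")

def path_similarity(a, b):
    """1 / (1 + number of edges on the shortest is-a path)."""
    lcs = lowest_common_subsumer(a, b)
    edges = depth(a) + depth(b) - 2 * depth(lcs)
    return 1.0 / (1.0 + edges)

def wu_palmer(a, b):
    """2 * depth(lcs) / (depth(a) + depth(b))."""
    lcs = lowest_common_subsumer(a, b)
    return 2.0 * depth(lcs) / (depth(a) + depth(b))

# Each measure is paired with an illustrative threshold (assumed values).
MEASURES = {"path": (path_similarity, 0.20), "wup": (wu_palmer, 0.35)}

def prune_keywords(keywords, measures=MEASURES):
    """Drop a keyword when a strict majority of measures rate its mean
    similarity to the other keywords below that measure's threshold."""
    if len(keywords) < 2:
        return list(keywords)
    kept = []
    for kw in keywords:
        others = [o for o in keywords if o != kw]
        votes = 0
        for similarity, threshold in measures.values():
            mean_sim = sum(similarity(kw, o) for o in others) / len(others)
            if mean_sim < threshold:
                votes += 1
        if votes * 2 <= len(measures):  # not flagged by a strict majority
            kept.append(kw)
    return kept

annotation = ["tiger", "cat", "grass", "tree", "building"]
pruned = prune_keywords(annotation)
```

In this toy setting "building" is the only keyword that both measures rate as weakly related to the rest of the annotation, so it is the one removed. With a real lexical database the thresholds and the fusion rule would need to be tuned on held-out data.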

Author information

Authors and Affiliations

Department of Computer Science, University of Texas at Dallas, Richardson, Texas 75083-0688, USA
Yohan Jin, Lei Wang & Latifur Khan

Editor information

Editors and Affiliations

Comp. Sci. and Eng. Dept., Arizona State University, Tempe, AZ, 85287
K. Selçuk Candan
Dipartimento di Informatica, Università Ca’ Foscari di Venezia, via Torino 155, 30172, Mestre, (VE), Italy
Augusto Celentano

About this paper

Cite this paper

Jin, Y., Wang, L., Khan, L. (2005). Improving Image Annotations Using WordNet. In: Candan, K.S., Celentano, A. (eds) Advances in Multimedia Information Systems. MIS 2005. Lecture Notes in Computer Science, vol 3665. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11551898_12

DOI: https://doi.org/10.1007/11551898_12
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-28792-6
Online ISBN: 978-3-540-31945-0
eBook Packages: Computer Science, Computer Science (R0)
