Abstract
Advances in technology generate huge amounts of non-textual information, such as images, and an efficient image annotation and retrieval system is highly desirable. Clustering algorithms make it possible to represent the visual features of images with a finite set of symbols. Building on this, many statistical models have been published that analyze the correspondence between visual features and words and discover hidden semantics. These models improve the annotation and retrieval of large image databases. However, the current state of the art, including our previous work, produces too many irrelevant keywords for images during annotation. In this paper, we propose a novel approach that augments the classical model with a generic knowledge base, WordNet. Our approach strives to prune irrelevant keywords using WordNet. To identify irrelevant keywords, we investigate various semantic similarity measures between keywords and fuse the outcomes of all these measures to make a final decision. We have implemented various models that link visual tokens with keywords based on the WordNet knowledge base, and evaluated their performance in terms of precision and recall on a benchmark dataset. The results show that augmenting the classical model with a knowledge base improves annotation accuracy by removing irrelevant keywords.
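The pruning idea described above can be illustrated with a minimal sketch. It is not the paper's implementation: the three similarity tables are made-up stand-ins for WordNet-based measures, and the threshold, the mean-similarity score, and the majority-vote fusion rule are all assumptions chosen for illustration. Every function and variable name here is hypothetical.

```python
from typing import Callable, Dict, FrozenSet, List, Tuple

# A measure maps a keyword pair to a similarity score in [0, 1].
Measure = Callable[[str, str], float]


def table_measure(table: Dict[FrozenSet[str], float]) -> Measure:
    """Build a measure from a lookup table (toy stand-in for a WordNet measure)."""
    def measure(a: str, b: str) -> float:
        return table.get(frozenset((a, b)), 0.0)  # unknown pairs score 0
    return measure


def mean_similarity(word: str, keywords: List[str], measure: Measure) -> float:
    """Average similarity of `word` to the other keywords of the same image."""
    others = [k for k in keywords if k != word]
    if not others:
        return 1.0  # a lone keyword has nothing to disagree with
    return sum(measure(word, k) for k in others) / len(others)


def prune_keywords(keywords: List[str], measures: List[Measure],
                   threshold: float = 0.3) -> List[str]:
    """Drop a keyword when a strict majority of measures score it below threshold."""
    kept = []
    for word in keywords:
        votes_to_drop = sum(
            mean_similarity(word, keywords, m) < threshold for m in measures
        )
        if 2 * votes_to_drop <= len(measures):  # no strict majority: keep it
            kept.append(word)
    return kept


def precision_recall(predicted: List[str], truth: List[str]) -> Tuple[float, float]:
    """Precision and recall of predicted keywords against ground-truth keywords."""
    predicted_set, truth_set = set(predicted), set(truth)
    hits = len(predicted_set & truth_set)
    precision = hits / len(predicted_set) if predicted_set else 0.0
    recall = hits / len(truth_set) if truth_set else 0.0
    return precision, recall


def pairs(scores: Dict[Tuple[str, str], float]) -> Dict[FrozenSet[str], float]:
    """Convert ordered-pair keys into unordered-pair keys."""
    return {frozenset(pair): value for pair, value in scores.items()}


# Made-up scores for illustration only; real values would come from WordNet.
measure_a = table_measure(pairs({
    ("sky", "sun"): 0.8, ("sky", "water"): 0.6, ("sun", "water"): 0.5,
    ("keyboard", "sky"): 0.1, ("keyboard", "sun"): 0.1, ("keyboard", "water"): 0.1,
}))
measure_b = table_measure(pairs({
    ("sky", "sun"): 0.7, ("sky", "water"): 0.5, ("sun", "water"): 0.5,
    ("keyboard", "sky"): 0.05,
}))
measure_c = table_measure(pairs({
    ("sky", "sun"): 0.6, ("sky", "water"): 0.4, ("sun", "water"): 0.3,
    ("keyboard", "sky"): 0.35, ("keyboard", "sun"): 0.4, ("keyboard", "water"): 0.3,
}))

annotated = ["sky", "sun", "water", "keyboard"]  # hypothetical model output
ground_truth = ["sky", "sun", "water"]           # hypothetical manual labels

pruned = prune_keywords(annotated, [measure_a, measure_b, measure_c])
before = precision_recall(annotated, ground_truth)
after = precision_recall(pruned, ground_truth)
```

In this toy run, two of the three measures score "keyboard" below the threshold, so it is removed, and precision rises while recall is unchanged. Other fusion rules, such as weighted voting or averaging normalized scores, would fit the same structure.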
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Jin, Y., Wang, L., Khan, L. (2005). Improving Image Annotations Using WordNet. In: Candan, K.S., Celentano, A. (eds) Advances in Multimedia Information Systems. MIS 2005. Lecture Notes in Computer Science, vol 3665. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11551898_12
DOI: https://doi.org/10.1007/11551898_12
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-28792-6
Online ISBN: 978-3-540-31945-0
eBook Packages: Computer Science, Computer Science (R0)