Automatic Image Annotation with Cooperation of Concept-Specific and Universal Visual Vocabularies

Wang, Yanjie; Liu, Xiabi; Jia, Yunde

doi:10.1007/978-3-642-11301-7_28

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 5916)

Included in the following conference series:

International Conference on Multimedia Modeling

Abstract

This paper proposes an automatic image annotation method based on concept-specific image representation and discriminative learning. First, concept-specific visual vocabularies are generated by assuming that the localized features of images labeled with a given concept follow a Gaussian Mixture Model (GMM). Each component of the GMM is taken as a visual token of the concept, and the visual tokens of all concepts are clustered to obtain a universal token set. Second, an image is represented as a concept-specific feature vector: the posterior probabilities of each universal visual token are averaged over all localized features of the image, and the averages are assigned to the corresponding concept-specific visual tokens. The feature vector of an image therefore varies from concept to concept. Finally, image annotation and retrieval are implemented under a discriminative learning framework for Bayesian classifiers, Max-Min posterior Pseudo-probabilities (MMP). The proposed method was evaluated on the popular Corel-5K database. Experimental comparisons with state-of-the-art methods show that the method is promising.
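
The representation step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes diagonal-covariance Gaussians for the universal tokens, and it assumes that "assigning" means each concept-specific token takes the averaged posterior of the universal token its cluster maps to. The names `universal_tokens`, `concept_maps`, and all numeric values are hypothetical toy data.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a diagonal-covariance Gaussian at x (assumed covariance form)."""
    log_p = 0.0
    for xi, mi, vi in zip(x, mean, var):
        log_p -= 0.5 * (math.log(2.0 * math.pi * vi) + (xi - mi) ** 2 / vi)
    return math.exp(log_p)


def token_posteriors(x, tokens):
    """Posterior probability of each universal token given one local feature.

    tokens is a list of (weight, mean, variance) triples.
    """
    joint = [w * gaussian_pdf(x, m, v) for (w, m, v) in tokens]
    total = sum(joint)
    if total == 0.0:
        # All densities underflowed: fall back to a uniform posterior.
        return [1.0 / len(tokens)] * len(tokens)
    return [j / total for j in joint]


def average_posteriors(features, tokens):
    """Average the token posteriors over all local features of one image."""
    sums = [0.0] * len(tokens)
    for x in features:
        for k, p in enumerate(token_posteriors(x, tokens)):
            sums[k] += p
    return [s / len(features) for s in sums]


def concept_feature_vector(avg_post, token_to_universal):
    """Build the concept-specific vector (assumed assignment rule).

    token_to_universal[i] is the index of the universal token that
    concept-specific token i was clustered into.
    """
    return [avg_post[u] for u in token_to_universal]


# Hypothetical universal token set: three 2-D Gaussians with equal weights.
universal_tokens = [
    (1.0 / 3.0, [0.0, 0.0], [1.0, 1.0]),
    (1.0 / 3.0, [5.0, 5.0], [1.0, 1.0]),
    (1.0 / 3.0, [10.0, 0.0], [1.0, 1.0]),
]

# Hypothetical local features extracted from one image.
features = [[0.1, -0.2], [4.9, 5.1], [0.0, 0.3]]

# Hypothetical mapping from each concept's tokens to universal token indices.
concept_maps = {"sky": [0, 2], "tiger": [1, 0]}

avg = average_posteriors(features, universal_tokens)
sky_vec = concept_feature_vector(avg, concept_maps["sky"])
tiger_vec = concept_feature_vector(avg, concept_maps["tiger"])
print(avg, sky_vec, tiger_vec)
```

In this toy setup the same image yields a different vector for "sky" than for "tiger", which is the property the abstract highlights. Fitting the per-concept GMMs, clustering the tokens, and MMP training are outside the scope of the sketch.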

Author information

Authors and Affiliations

Beijing Laboratory of Intelligent Information Technology, School of Computer Science, Beijing Institute of Technology, Beijing, P.R. China
Yanjie Wang, Xiabi Liu & Yunde Jia

Editor information

Editors and Affiliations

University of Oldenburg, Germany
Susanne Boll
University of Texas at San Antonio, San Antonio, TX, USA
Qi Tian
Microsoft Research Asia, Beijing, P.R. China
Lei Zhang
Southwest University, Beibei, Chongqing, China
Zili Zhang
School of Engineering and Information Technology, Deakin University, 221 Burwood Highway, Vic, 3125, Australia
Yi-Ping Phoebe Chen

About this paper

Cite this paper

Wang, Y., Liu, X., Jia, Y. (2010). Automatic Image Annotation with Cooperation of Concept-Specific and Universal Visual Vocabularies. In: Boll, S., Tian, Q., Zhang, L., Zhang, Z., Chen, YP.P. (eds) Advances in Multimedia Modeling. MMM 2010. Lecture Notes in Computer Science, vol 5916. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-11301-7_28

DOI: https://doi.org/10.1007/978-3-642-11301-7_28
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-11300-0
Online ISBN: 978-3-642-11301-7
eBook Packages: Computer Science, Computer Science (R0)
