Two-Probabilistic Latent Semantic Model for Image Annotation and Retrieval

  • Nattachai Watcharapinchai
  • Supavadee Aramvith
  • Supakorn Siddhichai
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6468)


A novel latent variable modeling technique for image annotation and retrieval is proposed. The model annotates images with relevant semantic meanings and retrieves images that satisfy a user's query, which may be given as text or as an image. A two-step latent variable framework is proposed so that a single system supports both annotation and retrieval. The existing and proposed image annotation models are also compared in terms of annotation performance, measured by precision and recall on images from standard databases, in order to identify the best model for automatic image annotation. Local features of each image in the database are extracted using the Scale-Invariant Feature Transform (SIFT) and quantized into visual words by clustering. Each image is then represented by a Bag-of-Features (BoF), which is a histogram of visual words. For annotation, semantic meanings are related to each BoF through a latent variable. For retrieval, each image query is likewise related to semantic meanings. Finally, retrieval results are obtained by matching the semantic meanings of the query with those of the images in the database using a second latent variable.
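The pipeline described above can be illustrated with a minimal sketch. It is not the authors' two-step model: it shows only a Bag-of-Features histogram followed by standard single-latent-variable probabilistic latent semantic analysis (pLSA) fitted with EM. The function names `bag_of_features` and `plsa`, the toy codebook, and all parameter values are assumptions made for illustration; in practice the descriptors would come from SIFT and the codebook from a clustering step such as k-means.

```python
import numpy as np

def bag_of_features(descriptors, codebook):
    """Histogram of nearest-visual-word assignments for one image."""
    # Squared Euclidean distance from every descriptor to every codeword.
    d2 = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    words = d2.argmin(axis=1)
    return np.bincount(words, minlength=codebook.shape[0])

def plsa(counts, n_topics, n_iter=50, seed=0):
    """Fit P(w|z) and P(z|d) to an images-by-words count matrix with EM."""
    rng = np.random.default_rng(seed)
    n_docs, n_words = counts.shape
    # Random positive initialisation, normalised to valid distributions.
    p_w_z = rng.random((n_topics, n_words))
    p_w_z /= p_w_z.sum(axis=1, keepdims=True)
    p_z_d = rng.random((n_docs, n_topics))
    p_z_d /= p_z_d.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        # E-step: P(z|d,w) is proportional to P(z|d) * P(w|z).
        joint = p_z_d[:, :, None] * p_w_z[None, :, :]  # (docs, topics, words)
        post = joint / (joint.sum(axis=1, keepdims=True) + 1e-12)
        # M-step: weight the posteriors by the observed word counts.
        weighted = counts[:, None, :] * post
        p_w_z = weighted.sum(axis=0)
        p_w_z /= p_w_z.sum(axis=1, keepdims=True) + 1e-12
        p_z_d = weighted.sum(axis=2)
        p_z_d /= p_z_d.sum(axis=1, keepdims=True) + 1e-12
    return p_w_z, p_z_d

# Toy example: a hypothetical 2-word codebook and three 2-D "descriptors".
codebook = np.array([[0.0, 0.0], [10.0, 10.0]])
descriptors = np.array([[0.0, 1.0], [9.0, 9.0], [10.0, 11.0]])
histogram = bag_of_features(descriptors, codebook)

# Toy count matrix: two images over a three-word vocabulary.
counts = np.array([[5, 0, 1], [0, 4, 2]])
p_w_z, p_z_d = plsa(counts, n_topics=2)
```

Each row of `p_z_d` is a per-image mixture over latent aspects, which is the kind of representation that could then be matched between a query and the database images.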


Image Retrieval; Visual Word; Latent Semantic Analysis; Image Annotation; Image Retrieval System
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Nattachai Watcharapinchai (1)
  • Supavadee Aramvith (1)
  • Supakorn Siddhichai (2)
  1. Department of Electrical Engineering, Chulalongkorn University, Thailand
  2. National Electronics and Computer Technology Center (NECTEC), Thailand
