Abstract

In this paper we introduce a new approach to the problem of image retrieval from text queries. We propose to estimate the relevance of an image to a word using a neighborhood-based estimator: the estimate is obtained by counting the number of word-relevant images among the K-neighborhood of the image, where the neighborhood itself is defined by means of a Bayesian approach. The local estimates for all the words that form a query are naively combined in order to score the images with respect to that query. Experiments show that the proposed approach is both more accurate and faster than state-of-the-art techniques. Special attention is paid to the computational behaviour and scalability of the proposed approach.
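The neighborhood-based estimation described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: the feature vectors, the Euclidean metric, and the product-based ("naive" independence) combination of per-word estimates are assumptions made here for concreteness, and the paper's Bayesian definition of the neighborhood is not reproduced.

```python
import numpy as np

def knn_word_relevance(image_feat, train_feats, train_labels, word, k=10):
    """Estimate P(word is relevant | image) as the fraction of
    word-relevant training images among the image's K nearest neighbors."""
    dists = np.linalg.norm(train_feats - image_feat, axis=1)  # Euclidean (assumed)
    neighbors = np.argsort(dists)[:k]
    return sum(word in train_labels[i] for i in neighbors) / k

def query_score(query_words, image_feat, train_feats, train_labels, k=10):
    """Naive combination: treat the query words as independent and
    multiply the per-word local relevance estimates."""
    score = 1.0
    for w in query_words:
        score *= knn_word_relevance(image_feat, train_feats, train_labels, w, k)
    return score
```

Images in the collection would then be ranked by `query_score` in decreasing order; with precomputed neighbor lists, scoring reduces to counting labels, which is what makes the approach fast.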

Keywords

Image Retrieval · Visual Word · Training Image · Image Representation · Image Annotation


Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Roberto Paredes¹
  1. ITI-UPV, Valencia, Spain
