Joint Image and Word Sense Discrimination for Image Retrieval

  • Aurelien Lucchi
  • Jason Weston
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7572)


We study the task of learning to rank images given a text query, a problem that is complicated by queries with multiple senses. Here, the senses of interest are typically the visually distinct concepts that a user may wish to retrieve. In this paper, we propose to learn a ranking function that optimizes the ranking cost of interest and simultaneously discovers the disambiguated senses of the query that are optimal for the supervised task. Note that no supervised information is given about the senses. Experiments performed on web images and the ImageNet dataset show that our approach leads to a clear gain in performance.
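The abstract does not spell out the model, so the following is only a minimal sketch of one way such joint learning could look, under stated assumptions: each query keeps several weight vectors (one per latent sense), an image is scored by its best-matching sense, and the vectors are trained with a pairwise hinge ranking loss by stochastic subgradient steps. The max-over-senses scorer, the loss, the toy 2-D data, and all names (`best_sense`, `score`, `train`, `separates`) are illustrative assumptions, not the authors' formulation.

```python
import random

def dot(w, x):
    """Inner product of two equal-length lists."""
    return sum(wi * xi for wi, xi in zip(w, x))

def best_sense(W, x):
    """Index of the (assumed) sense vector that scores image x highest."""
    return max(range(len(W)), key=lambda s: dot(W[s], x))

def score(W, x):
    """Relevance of x to the query: its score under the best sense."""
    return dot(W[best_sense(W, x)], x)

def train(pos, neg, n_senses, lr=0.1, steps=2000, seed=0):
    """Stochastic subgradient descent on a pairwise hinge ranking loss.

    No sense labels are used: the sense of each image is whichever
    weight vector currently scores it highest.
    """
    rng = random.Random(seed)
    dim = len(pos[0])
    W = [[rng.gauss(0.0, 0.1) for _ in range(dim)] for _ in range(n_senses)]
    for _ in range(steps):
        xp, xn = rng.choice(pos), rng.choice(neg)
        sp, sn = best_sense(W, xp), best_sense(W, xn)
        # Update only if the relevant image does not beat the
        # irrelevant one by a margin of 1.
        if 1.0 - dot(W[sp], xp) + dot(W[sn], xn) > 0.0:
            for i in range(dim):
                W[sp][i] += lr * xp[i]  # pull the winning sense toward xp
                W[sn][i] -= lr * xn[i]  # push the winning sense away from xn
    return W

def separates(W, pos, neg):
    """True if every relevant image outranks every irrelevant one."""
    return min(score(W, x) for x in pos) > max(score(W, x) for x in neg)

# Toy query with two visually distinct senses (opposite directions in
# feature space); irrelevant images lie along the other axis.
relevant = [[1.0, 0.0], [-1.0, 0.0]]
irrelevant = [[0.0, 1.0], [0.0, -1.0]]

W_two = train(relevant, irrelevant, n_senses=2)
W_one = train(relevant, irrelevant, n_senses=1)
print("two senses separate the data:", separates(W_two, relevant, irrelevant))
print("one sense separates the data:", separates(W_one, relevant, irrelevant))
```

On this toy data a single linear scorer cannot rank both relevant clusters above the irrelevant images, because the two clusters point in opposite directions. With two sense vectors, each cluster can be claimed by its own vector without any sense labels being supplied.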


Keywords (added by machine, not by the authors): Image Retrieval; Ranking Function; Image Annotation; Word Sense; Word Sense Disambiguation





Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Aurelien Lucchi (1, 2)
  • Jason Weston (1)
  1. Google, New York, USA
  2. EPFL, Lausanne, Switzerland
