Impact of Visual Information on Text and Content Based Image Retrieval

  • Christophe Moulin
  • Christine Largeron
  • Mathias Géry
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6218)


Multimedia documents composed of text and images are increasingly common, thanks to the Internet and the growing capacity of data storage, and it is ever more important to be able to find the needles in this huge haystack. In this paper, we present a multimedia document model that combines textual and visual information. Using a bag-of-words approach, it represents each document with one vector per modality, textual and visual. Given a multimedia query, our model combines the scores obtained for each modality and returns a list of relevant retrieved documents. This paper studies the influence of the weight given to the visual information relative to the textual information. Experiments on the multimedia ImageCLEF collection show that results can be improved by learning this weight parameter.
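The score combination described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the formula, so everything specific here is an assumption made for illustration: the linear form `alpha * visual + (1 - alpha) * text`, the use of cosine similarity for each modality, the grid search used to learn the weight, and all names (`cosine`, `combined_score`, `rank`, `average_precision`, `learn_alpha`, `alpha`).

```python
import math


def cosine(u, v):
    """Cosine similarity between two sparse bag-of-words vectors (dicts)."""
    dot = sum(w * v[t] for t, w in u.items() if t in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def combined_score(query, doc, alpha):
    """Assumed linear fusion: alpha weights the visual score, 1 - alpha the textual one."""
    text = cosine(query["text"], doc["text"])
    visual = cosine(query["visual"], doc["visual"])
    return alpha * visual + (1.0 - alpha) * text


def rank(query, docs, alpha):
    """Return document names sorted by decreasing combined score."""
    return sorted(docs, key=lambda name: combined_score(query, docs[name], alpha),
                  reverse=True)


def average_precision(ranking, relevant):
    """Average precision of a ranking given the set of relevant document names."""
    hits, total = 0, 0.0
    for position, name in enumerate(ranking, start=1):
        if name in relevant:
            hits += 1
            total += hits / position
    return total / len(relevant) if relevant else 0.0


def learn_alpha(train_queries, docs, steps=10):
    """Hypothetical weight learning: grid search over alpha in [0, 1],
    keeping the first value that maximizes mean average precision."""
    candidates = [i / steps for i in range(steps + 1)]

    def mean_ap(alpha):
        scores = [average_precision(rank(q, docs, alpha), rel)
                  for q, rel in train_queries]
        return sum(scores) / len(scores)

    return max(candidates, key=mean_ap)
```

Here each document and query is a dict with a `"text"` vector of term weights and a `"visual"` vector of visual-word weights. Setting `alpha = 0` reduces the model to text-only retrieval and `alpha = 1` to purely visual retrieval, which is the trade-off the paper sets out to study.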


Keywords: Visual Information, Image Retrieval, Visual Word, Textual Information, Local Approach



Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Christophe Moulin (1, 2, 3)
  • Christine Largeron (1, 2, 3)
  • Mathias Géry (1, 2, 3)

  1. Université de Lyon, Saint-Étienne, France
  2. CNRS, UMR 5516, Laboratoire Hubert Curien, Saint-Étienne, France
  3. Université de Saint-Étienne, Jean-Monnet, Saint-Étienne, France
