Combining Visual and Textual Modalities for Multimedia Ontology Matching

  • Nicolas James
  • Konstantin Todorov
  • Céline Hudelot
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6725)


Multimedia search and retrieval improve considerably when visual content is given explicit meaning with the help of ontologies. Several multimedia ontologies have been proposed recently as suitable knowledge models to narrow the well-known semantic gap and to enable the semantic interpretation of images. Since these ontologies have been created in different application contexts, establishing links between them, a task known as ontology matching, promises to fully unlock their potential in support of multimedia search and retrieval. This paper proposes and empirically compares two extensional ontology matching techniques applied to an important semantic image retrieval issue: automatically associating common-sense knowledge with multimedia concepts. First, we extend a previously introduced matching approach to use both textual and visual knowledge. In addition, a novel matching technique based on a multimodal graph is proposed. We argue that the textual and visual modalities have to be seen as complementary rather than as exclusive means to improve the efficiency of an ontology matching procedure in the multimedia domain. An experimental evaluation is included.


Keywords (added by machine, not by the authors): Image Annotation; Word Sense Disambiguation; Multimedia Document; Textual Modality; Visual Match





Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Nicolas James (1)
  • Konstantin Todorov (1)
  • Céline Hudelot (1)
  1. MAS Laboratory, École Centrale Paris, Châtenay-Malabry, France
