Multiple Ontology-Based Indexing of Multimedia Documents on the World Wide Web
To meet the growing need for precise search of multimedia documents on the Web, we propose a conceptual indexing framework for multimedia that incorporates semantic relations between annotation words. We first use our DOM tree-based Webpage segmentation algorithm to automatically extract the text surrounding multimedia documents in Webpages. Next, we employ knowledge represented in multiple ontologies to discover the latent semantic dimensions of this surrounding text. The resulting indexes are represented as semantic networks, in which the nodes of each network capture words that exist in the ontologies and the edges represent the semantic relations that hold between those words. To address semantic heterogeneity between the produced networks, we employ a multi-level merging algorithm that combines heterogeneous networks into a single, more coherent network. In addition, we use concept-relatedness measures to handle entities that the ontologies do not recognize. We evaluate the techniques of the proposed framework on three different types of multimedia datasets. Experimental results indicate that the proposed techniques are effective and precise.
Keywords: Multimedia indexing, Webpage segmentation, Ontology
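As a rough illustration of the indexing idea summarized in the abstract, the following minimal sketch builds one semantic network per ontology from a list of annotation words, merges the networks, and reports the words that no ontology recognizes. Everything in it is an assumption made for illustration: the two toy ontologies, the relation names, and the function names `build_network`, `merge_networks`, and `unrecognized` are hypothetical. The merge is a plain set union, not the multi-level merging algorithm of the framework, and no concept-relatedness measure is applied to the unrecognized words.

```python
# Toy stand-ins for real ontologies; the triples are illustrative only.
ONTOLOGY_A = {
    ("jaguar", "is_a", "cat"),
    ("cat", "is_a", "animal"),
}
ONTOLOGY_B = {
    ("jaguar", "is_a", "animal"),
    ("jaguar", "lives_in", "forest"),
}


def build_network(words, ontology):
    """Return (nodes, edges) restricted to the words the ontology knows."""
    known = {term for (s, _, o) in ontology for term in (s, o)}
    nodes = {w for w in words if w in known}
    # Keep only relations whose two endpoints both occur in the text.
    edges = {(s, r, o) for (s, r, o) in ontology if s in nodes and o in nodes}
    return nodes, edges


def merge_networks(net_a, net_b):
    """Naive merge (assumption): union of the node sets and of the edge sets."""
    return net_a[0] | net_b[0], net_a[1] | net_b[1]


def unrecognized(words, *networks):
    """Words from the text that appear in none of the given networks."""
    recognized = set().union(*(nodes for nodes, _ in networks))
    return [w for w in words if w not in recognized]


# Hypothetical words extracted from the text surrounding an image.
words = ["jaguar", "cat", "forest", "animal", "selfie"]
net_a = build_network(words, ONTOLOGY_A)
net_b = build_network(words, ONTOLOGY_B)
merged_nodes, merged_edges = merge_networks(net_a, net_b)

print(sorted(merged_nodes))               # ['animal', 'cat', 'forest', 'jaguar']
print(unrecognized(words, net_a, net_b))  # ['selfie']
```

In this sketch, "selfie" is the kind of entity that the framework would pass to a concept-relatedness measure, since neither toy ontology contains it.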
Open Access This chapter is licensed under the terms of the Creative Commons Attribution-NonCommercial 2.5 International License (http://creativecommons.org/licenses/by-nc/2.5/), which permits any noncommercial use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.