Abstract
The World Wide Web today comprises billions of Web documents covering varied topics and presented through different media such as text, images, audio, and video. Along with textual information, the number of images on the Web is therefore growing exponentially. Compared with text, annotating images by their semantics is more complicated because there is a lack of correlation between the user's semantics and the low-level features a computer system can extract. Moreover, Web pages generally contain content on multiple topics, and the context relevant to an image makes up only a small portion of the full text of the page, which makes it challenging for image search engines to annotate and index Web images. Existing image annotation systems annotate Web images using contextual information from the page title, the image src and alt attributes, meta tags, and the text surrounding the image. Some more recent approaches perform page segmentation as a preprocessing step. This paper proposes a novel approach for annotating Web images. In this work, Web pages are divided into Web content blocks based on the visual structure of the page, and the textual data of the content blocks that are semantically closest to the blocks containing Web images are then extracted. The relevant keywords from this textual information, along with the contextual information of the images, are used for annotation.
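The block-based idea can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes page segmentation has already produced rectangular blocks, uses the Euclidean distance between block centres as a stand-in for closeness to the image block, and ranks keywords by plain term frequency. The `Block` class, the stopword list, and the function names are all hypothetical and chosen only for illustration.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass

# Tiny illustrative stopword list (an assumption, not from the paper).
STOPWORDS = {"the", "a", "an", "of", "in", "on", "and", "is", "at", "to", "for", "with"}


@dataclass
class Block:
    """A rectangular Web content block, assumed to come from a page segmenter."""
    x: float
    y: float
    width: float
    height: float
    text: str = ""
    is_image: bool = False

    def center(self):
        return (self.x + self.width / 2, self.y + self.height / 2)


def euclidean(a, b):
    """Euclidean distance between the centres of two blocks."""
    (ax, ay), (bx, by) = a.center(), b.center()
    return math.hypot(ax - bx, ay - by)


def nearest_text_blocks(image_block, blocks, k=2):
    """Return the k non-empty text blocks closest to the image block."""
    text_blocks = [b for b in blocks if not b.is_image and b.text.strip()]
    return sorted(text_blocks, key=lambda b: euclidean(image_block, b))[:k]


def candidate_keywords(image_block, blocks, alt_text="", k=2, top_n=5):
    """Rank words from the alt text and the nearest text blocks by frequency."""
    nearby = nearest_text_blocks(image_block, blocks, k)
    text = " ".join([alt_text] + [b.text for b in nearby])
    words = [
        w for w in re.findall(r"[a-z]+", text.lower())
        if w not in STOPWORDS and len(w) > 2
    ]
    return [w for w, _ in Counter(words).most_common(top_n)]


# Hypothetical page layout: one image, a caption, an article block, a far sidebar.
image = Block(0, 0, 100, 100, is_image=True)
caption = Block(0, 110, 100, 20, "Tiger resting in the forest")
article = Block(120, 0, 200, 100, "The tiger is a large cat. Tiger habitat includes forest.")
sidebar = Block(500, 0, 100, 400, "Stock market news and weather")
page = [image, caption, article, sidebar]

keywords = candidate_keywords(image, page, alt_text="tiger")
```

In this toy layout the sidebar is far from the image, so its text does not contribute any keywords. A fuller version would replace the term-frequency ranking with a semantic relatedness measure between candidate words and the image context.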
Copyright information
© 2019 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Gulati, P., Yadav, M. (2019). A Novel Approach for Extracting Pertinent Keywords for Web Image Annotation Using Semantic Distance and Euclidean Distance. In: Hoda, M., Chauhan, N., Quadri, S., Srivastava, P. (eds) Software Engineering. Advances in Intelligent Systems and Computing, vol 731. Springer, Singapore. https://doi.org/10.1007/978-981-10-8848-3_17
DOI: https://doi.org/10.1007/978-981-10-8848-3_17
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-8847-6
Online ISBN: 978-981-10-8848-3
eBook Packages: Engineering, Engineering (R0)