A Novel Approach for Extracting Pertinent Keywords for Web Image Annotation Using Semantic Distance and Euclidean Distance

Gulati, Payal; Yadav, Manisha

doi:10.1007/978-981-10-8848-3_17


Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 731)


Abstract

The World Wide Web today comprises billions of Web documents covering varied topics and presented through different types of media such as text, images, audio, and video. Along with textual information, the number of images on the WWW is therefore growing exponentially. Compared with text, annotating images by their semantics is more complicated because there is a lack of correlation between the user's semantics and the low-level features available to a computer system. Moreover, Web pages generally contain content on multiple topics, and the context relevant to an image makes up only a small portion of the page's full text, which makes it challenging for image search engines to annotate and index Web images. Existing image annotation systems annotate a Web image using contextual information from the page title, the image src and alt attributes, meta tags, and the text surrounding the image. More recent approaches also perform page segmentation as a preprocessing step. This paper proposes a novel approach for annotating Web images. In this work, Web pages are divided into Web content blocks based on the visual structure of the page, and the textual data of those content blocks that are semantically closer to the blocks containing Web images is then extracted. The relevant keywords from this textual information, together with the contextual information of the images, are used for annotation.
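
The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the distance measures, so the Block class, the Jaccard-based semantic_distance stand-in, the stopword list, and the thresholds max_pixels and max_semantic are all hypothetical choices made only for illustration. The sketch assumes the page has already been segmented into content blocks with known positions.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass

# Tiny illustrative stopword list (an assumption, not taken from the paper).
STOPWORDS = {"the", "a", "an", "of", "in", "on", "and", "to", "is", "for", "with", "at"}


@dataclass
class Block:
    """Hypothetical Web content block: centre coordinates plus its text."""
    x: float
    y: float
    text: str


def tokens(text):
    """Lowercase alphabetic tokens with stopwords removed."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def euclidean_distance(a, b):
    """Straight-line distance between two block centres."""
    return math.hypot(a.x - b.x, a.y - b.y)


def semantic_distance(text_a, text_b):
    """Jaccard distance over token sets: a simple stand-in measure."""
    sa, sb = set(tokens(text_a)), set(tokens(text_b))
    if not sa or not sb:
        return 1.0
    return 1.0 - len(sa & sb) / len(sa | sb)


def annotate(image_block, text_blocks, max_pixels=300.0, max_semantic=0.9, top_k=5):
    """Collect keywords from the image context and from nearby, related blocks."""
    counts = Counter(tokens(image_block.text))
    for block in text_blocks:
        near = euclidean_distance(image_block, block) <= max_pixels
        related = semantic_distance(image_block.text, block.text) <= max_semantic
        if near and related:
            counts.update(tokens(block.text))
    return [word for word, _ in counts.most_common(top_k)]


# Example: the image block carries its contextual text (e.g. alt text or title).
image = Block(100, 100, "Bengal tiger in forest")
page_blocks = [
    Block(120, 180, "The tiger is the largest cat species found in forest habitats of Asia"),
    Block(900, 1200, "Subscribe to our newsletter for daily deals"),
]
print(annotate(image, page_blocks))
```

In this sketch a block contributes keywords only if it is both spatially close to the image block and shares vocabulary with the image's contextual text. A WordNet-based similarity could replace the Jaccard stand-in without changing the overall structure.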



Author information

Authors and Affiliations

YMCA UST, Faridabad, Haryana, India
Payal Gulati
RPSGOI, Mahendergarh, Haryana, India
Manisha Yadav


Corresponding author

Correspondence to Manisha Yadav.

Editor information

Editors and Affiliations

Bharati Vidyapeeth’s Institute of Computer Applications and Management (BVICAM), New Delhi, Delhi, India
M. N. Hoda
Department of Computer Engineering, YMCAUST, Faridabad, Haryana, India
Naresh Chauhan
Department of Computer Science, University of Kashmir, Srinagar, Jammu and Kashmir, India
S. M. K. Quadri
Department of Information Technology and Systems, Indian Institute of Management Rohtak, Rohtak, Haryana, India
Praveen Ranjan Srivastava


About this paper

Cite this paper

Gulati, P., Yadav, M. (2019). A Novel Approach for Extracting Pertinent Keywords for Web Image Annotation Using Semantic Distance and Euclidean Distance. In: Hoda, M., Chauhan, N., Quadri, S., Srivastava, P. (eds) Software Engineering. Advances in Intelligent Systems and Computing, vol 731. Springer, Singapore. https://doi.org/10.1007/978-981-10-8848-3_17


DOI: https://doi.org/10.1007/978-981-10-8848-3_17
Published: 13 June 2018
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-8847-6
Online ISBN: 978-981-10-8848-3
eBook Packages: Engineering, Engineering (R0)
