Automatic Metadata Harvesting from Digital Content Using NLP

Doshi, Rushabh D; Sidpara, Chintan B; Khimani, Kunal U

doi:10.1007/978-3-319-30933-0_48

Part of the book series: Smart Innovation, Systems and Technologies (SIST, volume 50)

Abstract

Metadata harvesting is one of the prime research fields in information retrieval. Metadata is used to reference information resources, and it plays a significant role in describing and searching documents. In its early stages, metadata harvesting was done manually. Automatic metadata harvesting techniques were developed later, but they remain human-intensive: they require expert decisions to identify relevant metadata, which is also time-consuming. Moreover, most automatic metadata harvesting techniques work only with structured formats. We propose a new approach to harvesting metadata from documents using NLP. NLP (Natural Language Processing) works on the natural language that humans use in day-to-day life.

Author information

Authors and Affiliations

V.V.P Engineering College, Rajkot, Gujarat, India
Rushabh D Doshi, Chintan B Sidpara & Kunal U Khimani

Corresponding author

Correspondence to Rushabh D Doshi.

Editor information

Editors and Affiliations

Anil Neerukonda Ins. of Tech. & Sci., Visakhapatnam, Andhra Pradesh, India
Suresh Chandra Satapathy
Indian Statistical Institute, Jadavpur University, Kolkata, India
Swagatam Das

About this paper

Cite this paper

Doshi, R.D., Sidpara, C.B., Khimani, K.U. (2016). Automatic Metadata Harvesting from Digital Content Using NLP. In: Satapathy, S., Das, S. (eds) Proceedings of First International Conference on Information and Communication Technology for Intelligent Systems: Volume 1. Smart Innovation, Systems and Technologies, vol 50. Springer, Cham. https://doi.org/10.1007/978-3-319-30933-0_48

DOI: https://doi.org/10.1007/978-3-319-30933-0_48
Published: 01 July 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-30932-3
Online ISBN: 978-3-319-30933-0
eBook Packages: Engineering, Engineering (R0)
