Key Phrase Extraction System for Agricultural Documents

Johnny, Swapna; Jaya Nirmala, S.

doi:10.1007/978-981-15-1384-8_20

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1025)

Included in the following conference series:

International Conference on Information, Communication and Computing Technology

Abstract

Keywords play a vital role in retrieving relevant, semantically related documents from a large and varied collection, because they represent the main topics a document covers. Manual keyword extraction, however, is tedious and time-consuming, so an automated keyword extraction system is needed to make relevant documents easier to retrieve. This paper focuses on agriculture-related documents. Agrovoc, an agricultural vocabulary containing more than 35,000 concepts, is used to extract relevant keywords from each document. The proposed system extracts the relevant keywords from agricultural documents, and these keywords are then used to retrieve relevant documents. As the number of documents on the Internet grows, keywords and their corresponding documents must also be stored efficiently. The proposed system therefore uses a trie-based inverted index for efficient storage and retrieval of keywords and their related documents.
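The two ideas named in the abstract, vocabulary-driven extraction and a trie-based inverted index, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the matching strategy or the trie layout, so the character-level trie, the n-gram matching, the names `TrieInvertedIndex` and `extract_key_phrases`, and the four-term `VOCAB` set (a hypothetical stand-in for Agrovoc) are all assumptions made for illustration.

```python
import re


class TrieNode:
    def __init__(self):
        self.children = {}     # character -> TrieNode
        self.postings = set()  # document ids; non-empty only where a phrase ends


class TrieInvertedIndex:
    """Inverted index whose dictionary of key phrases is stored as a trie."""

    def __init__(self):
        self.root = TrieNode()

    def add(self, phrase, doc_id):
        node = self.root
        for ch in phrase.lower():
            node = node.children.setdefault(ch, TrieNode())
        node.postings.add(doc_id)

    def _walk(self, text):
        node = self.root
        for ch in text:
            node = node.children.get(ch)
            if node is None:
                return None
        return node

    def lookup(self, phrase):
        """Return the ids of documents indexed under exactly this phrase."""
        node = self._walk(phrase.lower())
        return set(node.postings) if node else set()

    def with_prefix(self, prefix):
        """Map every indexed phrase that starts with prefix to its document ids."""
        prefix = prefix.lower()
        start = self._walk(prefix)
        results = {}
        if start is None:
            return results
        stack = [(start, prefix)]
        while stack:
            node, text = stack.pop()
            if node.postings:
                results[text] = set(node.postings)
            for ch, child in node.children.items():
                stack.append((child, text + ch))
        return results


def extract_key_phrases(text, vocabulary, max_len=3):
    """Return vocabulary phrases that occur in text as whole-word n-grams."""
    words = re.findall(r"[a-z]+", text.lower())
    found = set()
    for n in range(1, max_len + 1):
        for i in range(len(words) - n + 1):
            candidate = " ".join(words[i:i + n])
            if candidate in vocabulary:
                found.add(candidate)
    return found


# Hypothetical stand-in for Agrovoc and two made-up documents.
VOCAB = {"rice", "soil fertility", "crop rotation", "irrigation"}
DOCS = {
    "d1": "Crop rotation improves soil fertility in rice fields.",
    "d2": "Drip irrigation saves water; rice yields stay stable.",
}

index = TrieInvertedIndex()
for doc_id, text in DOCS.items():
    for phrase in extract_key_phrases(text, VOCAB):
        index.add(phrase, doc_id)

print(sorted(index.lookup("rice")))
print(index.with_prefix("s"))
```

Storing the postings at the trie node where a phrase ends lets phrases with a common prefix share nodes, and it makes prefix queries such as `with_prefix("s")` a single traversal. A production system would likely also need stemming and a ranking step, neither of which is shown here.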

Author information

Authors and Affiliations

National Institute of Technology, Trichy, India
Swapna Johnny & S. Jaya Nirmala

Corresponding authors

Correspondence to Swapna Johnny or S. Jaya Nirmala.

Editor information

Editors and Affiliations

University of Malaya, Kuala Lumpur, Malaysia
Abdullah Bin Gani
Department of Computer Science and Engineering, Indian Institute of Technology Guwahati, Guwahati, India
Pradip Kumar Das
Department of IT, Jagan Institute of Management Studies, New Delhi, India
Latika Kharb
Department of IT, Jagan Institute of Management Studies, New Delhi, India
Deepak Chahal

About this paper

Cite this paper

Johnny, S., Jaya Nirmala, S. (2019). Key Phrase Extraction System for Agricultural Documents. In: Gani, A., Das, P., Kharb, L., Chahal, D. (eds) Information, Communication and Computing Technology. ICICCT 2019. Communications in Computer and Information Science, vol 1025. Springer, Singapore. https://doi.org/10.1007/978-981-15-1384-8_20

DOI: https://doi.org/10.1007/978-981-15-1384-8_20
Published: 13 November 2019
Publisher Name: Springer, Singapore
Print ISBN: 978-981-15-1383-1
Online ISBN: 978-981-15-1384-8
eBook Packages: Computer Science, Computer Science (R0)