Abstract
Keywords play a vital role in retrieving relevant and semantically related documents from a large, heterogeneous collection, because they represent the main topics a document covers. Manual keyword extraction, however, is tedious and time-consuming, so an automated keyword extraction system is needed to make retrieval of relevant documents easier. This paper focuses on agriculture-related documents. Agrovoc, an agricultural vocabulary containing more than 35,000 concepts, is used to extract relevant keywords from each document. The proposed system extracts these keywords from agricultural documents and then uses them to retrieve relevant documents. In addition, as the number of documents on the Internet grows, keywords and their corresponding documents must be stored efficiently. The proposed system therefore uses a trie-based inverted index for efficient storage and retrieval of keywords and their related documents.
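The two ideas in the abstract, matching document text against a controlled vocabulary and storing the matches in a trie-based inverted index, can be illustrated with a minimal sketch. It is not the authors' implementation: the names `TrieNode`, `TrieInvertedIndex`, and `extract_keywords` are hypothetical, the four-term vocabulary is a stand-in for Agrovoc, and exact n-gram matching is an assumed simplification of whatever extraction method the paper actually uses.

```python
import re
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass
class TrieNode:
    children: Dict[str, "TrieNode"] = field(default_factory=dict)
    # Posting list: IDs of documents containing the keyword that ends here.
    doc_ids: Set[str] = field(default_factory=set)


class TrieInvertedIndex:
    """Hypothetical trie whose terminal nodes hold posting lists."""

    def __init__(self) -> None:
        self.root = TrieNode()

    def add(self, keyword: str, doc_id: str) -> None:
        node = self.root
        for ch in keyword.lower():
            node = node.children.setdefault(ch, TrieNode())
        node.doc_ids.add(doc_id)

    def _walk(self, text: str) -> Optional[TrieNode]:
        node = self.root
        for ch in text:
            node = node.children.get(ch)
            if node is None:
                return None
        return node

    def lookup(self, keyword: str) -> Set[str]:
        node = self._walk(keyword.lower())
        return set(node.doc_ids) if node else set()

    def with_prefix(self, prefix: str) -> Dict[str, Set[str]]:
        # Collect every stored keyword that starts with the prefix.
        prefix = prefix.lower()
        start = self._walk(prefix)
        results: Dict[str, Set[str]] = {}
        if start is None:
            return results
        stack = [(prefix, start)]
        while stack:
            text, node = stack.pop()
            if node.doc_ids:
                results[text] = set(node.doc_ids)
            for ch, child in node.children.items():
                stack.append((text + ch, child))
        return results


def extract_keywords(text: str, vocabulary: Set[str], max_words: int = 3) -> Set[str]:
    # Assumed simplification: keep word n-grams that appear verbatim in the vocabulary.
    words = re.findall(r"[a-z]+", text.lower())
    found: Set[str] = set()
    for n in range(1, max_words + 1):
        for i in range(len(words) - n + 1):
            phrase = " ".join(words[i:i + n])
            if phrase in vocabulary:
                found.add(phrase)
    return found


# Toy stand-in for Agrovoc (the real vocabulary has more than 35,000 concepts).
vocabulary = {"rice", "rice husk", "irrigation", "soil fertility"}
documents = {
    "d1": "Drip irrigation improves soil fertility in rice fields.",
    "d2": "Rice husk is used as a soil amendment.",
}

index = TrieInvertedIndex()
for doc_id, text in documents.items():
    for keyword in extract_keywords(text, vocabulary):
        index.add(keyword, doc_id)
```

With this toy data, `index.lookup("rice")` returns both document IDs, while `index.with_prefix("rice")` also surfaces the phrase "rice husk". Keywords that share a prefix share trie nodes, which is one reason a trie can be more compact than a flat dictionary when many vocabulary terms overlap.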
Copyright information
© 2019 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Johnny, S., Jaya Nirmala, S. (2019). Key Phrase Extraction System for Agricultural Documents. In: Gani, A., Das, P., Kharb, L., Chahal, D. (eds) Information, Communication and Computing Technology. ICICCT 2019. Communications in Computer and Information Science, vol 1025. Springer, Singapore. https://doi.org/10.1007/978-981-15-1384-8_20
DOI: https://doi.org/10.1007/978-981-15-1384-8_20
Publisher Name: Springer, Singapore
Print ISBN: 978-981-15-1383-1
Online ISBN: 978-981-15-1384-8
eBook Packages: Computer Science, Computer Science (R0)