A Comparative Study of Key Phrase Extraction for Cross-Domain Document Collections
Extraction tools have become useful for text mining researchers who need to find keywords and keyphrases in documents. Extracting keywords and keyphrases from cross-domain information is more challenging because the domains of interest differ in word usage. In this paper, two popular keyphrase extraction tools, Maui and Carrot, are investigated for extracting terms from cross-domain document databases. The characteristics of keyword or keyphrase matching among different domain collections are presented and used to select a keyphrase extraction tool for patent documents and scientific publications. In our experiment, matching between a patent and its cited publication is the key point. For evaluation, the performance of cross-domain matching is measured by comparing similarity measures across the results of the extraction tools. The experimental results show that Maui is the more appropriate keyphrase extraction tool for matching cross-domain document collections, with its best performance, measured by cosine similarity, of 3.31% when compared with Carrot.
Keywords: Keyphrase extraction tools; Cross-domain document collection; Patent; Publication; Similarity measures; Maui; Carrot
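The abstract evaluates cross-domain matching with cosine similarity over the extracted keyphrases but does not say how the vectors are built. The following is a minimal sketch, assuming each document is represented as a bag of lowercased keyphrases weighted by raw count; the function name `cosine_similarity` and the sample phrase lists are hypothetical and are not taken from the paper or from the Maui or Carrot tools.

```python
import math
from collections import Counter


def cosine_similarity(phrases_a, phrases_b):
    """Cosine similarity between two bags of keyphrases (assumed count weighting)."""
    vec_a = Counter(p.lower() for p in phrases_a)
    vec_b = Counter(p.lower() for p in phrases_b)
    # Dot product over the phrases that both documents share.
    dot = sum(vec_a[p] * vec_b[p] for p in vec_a.keys() & vec_b.keys())
    norm_a = math.sqrt(sum(c * c for c in vec_a.values()))
    norm_b = math.sqrt(sum(c * c for c in vec_b.values()))
    if norm_a == 0 or norm_b == 0:
        # An empty keyphrase list matches nothing.
        return 0.0
    return dot / (norm_a * norm_b)


# Invented keyphrases standing in for a patent and the publication it cites.
patent_phrases = ["image sensor", "pixel array", "noise reduction"]
publication_phrases = ["noise reduction", "image sensor", "signal processing", "cmos"]

score = cosine_similarity(patent_phrases, publication_phrases)
print(f"cosine similarity: {score:.3f}")
```

Under this assumption, a score would be computed for each patent and cited publication pair using each tool's extracted phrases, and the tools would then be compared on those scores.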