Graph-Based Keyword Extraction

Alqaryouti, Omar; Khwileh, Hassan; Farouk, Tarek; Nabhan, Ahmed; Shaalan, Khaled

doi:10.1007/978-3-319-67056-0_9

Part of the book series: Studies in Computational Intelligence (SCI, volume 740)

Abstract

Keyword extraction has attracted growing interest in the era of information explosion. Its use in document categorization, indexing and classification has drawn attention to graph-based keyword extraction. This research examines the impact of several factors on the results of applying a graph-based keyword extraction approach to a scientific dataset. The study applies a new model that processes Medline scientific abstracts, produces graphs, and extracts 3-graphlets and 4-graphlets from those graphs. The experiment first constructs a dataset that records, for each abstract, the keywords, their occurrences in the proposed graphlet patterns, and the associated class. A supervised Naïve Bayes classifier is then applied to assign each word a probability of being a keyword, and the performance of the graph-based keyword extraction approach is evaluated. The model achieved significant results compared with the Term Frequency/Inverse Document Frequency (TF/IDF) baseline. The experimental results demonstrate that graphs and graphlet patterns can be used for keyword extraction tasks.
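
To make the idea of counting a word's occurrences in graphlet patterns more concrete, the following is a minimal sketch and not the chapter's implementation. The abstract does not say how the graphs are built, so the co-occurrence window, the function names `build_cooccurrence_graph` and `count_3_graphlets`, and the three role labels are all assumptions made for illustration. Only 3-node graphlets are covered; 4-graphlets would follow the same pattern with more cases.

```python
from collections import defaultdict
from itertools import combinations


def build_cooccurrence_graph(tokens, window=2):
    """Build an undirected word graph (assumed construction).

    Words are nodes. An edge links two different words that fall inside
    the same sliding window; the window counts the word itself, so
    window=2 links each word to the word that follows it.
    """
    adj = defaultdict(set)
    for i, word in enumerate(tokens):
        for other in tokens[i + 1 : i + window]:
            if other != word:
                adj[word].add(other)
                adj[other].add(word)
    return dict(adj)


def count_3_graphlets(adj):
    """Count, for every node, its roles in induced 3-node graphlets.

    Roles (hypothetical labels): 'path_center' and 'path_end' for an
    open 3-node path, and 'triangle' for a closed one.
    """
    counts = {
        node: {"path_center": 0, "path_end": 0, "triangle": 0}
        for node in adj
    }
    for center, neighbours in adj.items():
        for a, b in combinations(sorted(neighbours), 2):
            if b in adj[a]:
                # Closed triple: visited once per member acting as centre,
                # so each triangle adds exactly 1 to each of its nodes.
                counts[center]["triangle"] += 1
            else:
                # Open triple a - center - b: it has a unique centre,
                # so it is visited exactly once.
                counts[center]["path_center"] += 1
                counts[a]["path_end"] += 1
                counts[b]["path_end"] += 1
    return counts


if __name__ == "__main__":
    tokens = "graph based keyword extraction uses graph patterns".split()
    graph = build_cooccurrence_graph(tokens)
    for word, roles in sorted(count_3_graphlets(graph).items()):
        print(word, roles)
```

Per-word role counts of this kind could serve as the features passed to a supervised classifier such as Naïve Bayes, with a keyword/non-keyword label for each word; how the chapter actually encodes its features is not stated in the abstract.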

Author information

Authors and Affiliations

Faculty of Engineering and IT, The British University in Dubai, Dubai, UAE
Omar Alqaryouti, Hassan Khwileh, Tarek Farouk & Khaled Shaalan
Faculty of Computers and Information, Fayoum University, Fayoum, Egypt
Ahmed Nabhan
Member Technology, Sears Holdings, Hoffman Estates, USA
Ahmed Nabhan
School of Informatics, University of Edinburgh, Edinburgh, UK
Khaled Shaalan

Corresponding author

Correspondence to Omar Alqaryouti.

Editor information

Editors and Affiliations

The British University in Dubai, Dubai, United Arab Emirates
Khaled Shaalan
Faculty of Computers and Information Technology, Cairo University, Giza, Egypt
Aboul Ella Hassanien
Faculty of Computers and Information, Ain Shams University, Cairo, Egypt
Fahmy Tolba

About this chapter

Cite this chapter

Alqaryouti, O., Khwileh, H., Farouk, T., Nabhan, A., Shaalan, K. (2018). Graph-Based Keyword Extraction. In: Shaalan, K., Hassanien, A., Tolba, F. (eds) Intelligent Natural Language Processing: Trends and Applications. Studies in Computational Intelligence, vol 740. Springer, Cham. https://doi.org/10.1007/978-3-319-67056-0_9

DOI: https://doi.org/10.1007/978-3-319-67056-0_9
Published: 18 November 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-67055-3
Online ISBN: 978-3-319-67056-0
eBook Packages: Engineering, Engineering (R0)
