AKEA: An Arabic Keyphrase Extraction Algorithm

Amer, Eslam; Foad, Khaled

doi:10.1007/978-3-319-48308-5_14

Conference paper
First Online: 18 October 2016

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 533)

Abstract

Keyphrase extraction is a critical step in many natural language processing and information retrieval applications. In this paper, we introduce AKEA, a keyphrase extraction algorithm for single Arabic documents. AKEA is unsupervised: it needs no training to perform its task. It relies on heuristics that combine linguistic patterns based on Part-Of-Speech (POS) tags, statistical knowledge, and the internal structural pattern of terms (i.e., word occurrence). We use Arabic Wikipedia to improve the ranking (or significance) of candidate keyphrases by adding a confidence score if the candidate exists as an indexed Wikipedia concept. Experimental results show that, on average, AKEA achieves the highest precision and the highest F-measure, which indicates that it produces more accurate results than the other algorithms it was compared against.
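
The abstract gives only the ingredients of the ranking (POS patterns, statistics, a Wikipedia confidence score), not the formulas. The following is a minimal sketch of how such a pipeline could be put together, not the authors' implementation. The tag names (`N`, `ADJ`), the scoring rule (frequency times phrase length), the `wiki_bonus` value, and the function names are all assumptions made for illustration; the toy tokens are English placeholders standing in for POS-tagged Arabic text.

```python
from collections import Counter

# Hypothetical tag set: runs of nouns ("N") and adjectives ("ADJ") form candidates.
PHRASE_TAGS = {"N", "ADJ"}


def candidate_phrases(tagged_tokens, max_len=3):
    """Return every n-gram (n <= max_len) found inside runs of N/ADJ tokens."""
    candidates = []
    run = []
    # The sentinel token flushes a run that reaches the end of the text.
    for word, tag in list(tagged_tokens) + [("", "END")]:
        if tag in PHRASE_TAGS:
            run.append(word)
            continue
        for i in range(len(run)):
            for j in range(i + 1, min(i + max_len, len(run)) + 1):
                candidates.append(" ".join(run[i:j]))
        run = []
    return candidates


def rank(tagged_tokens, wiki_titles, wiki_bonus=1.0):
    """Score candidates statistically, then boost those that are Wikipedia titles.

    Assumed score: frequency * number of words, plus wiki_bonus when the
    candidate appears in wiki_titles (a stand-in for an Arabic Wikipedia index).
    """
    counts = Counter(candidate_phrases(tagged_tokens))
    scores = {}
    for phrase, freq in counts.items():
        score = freq * len(phrase.split())
        if phrase in wiki_titles:
            score += wiki_bonus
        scores[phrase] = score
    # Highest score first; ties are broken alphabetically for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    tagged = [
        ("keyphrase", "N"), ("extraction", "N"), ("is", "V"),
        ("useful", "ADJ"), ("for", "P"),
        ("keyphrase", "N"), ("extraction", "N"),
    ]
    for phrase, score in rank(tagged, wiki_titles={"keyphrase extraction"}):
        print(phrase, score)
```

In this sketch the Wikipedia lookup is a plain set membership test; a real system would need title normalization and an actual index of Arabic Wikipedia concepts, neither of which is described in the abstract.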

Author information

Authors and Affiliations

Faculty of Computers and Information, Computer Science Department, Banha University, Banha, Egypt
Eslam Amer
Faculty of Computers and Information, Information System Department, Banha University, Banha, Egypt
Khaled Foad

Corresponding author

Correspondence to Eslam Amer.

Editor information

Editors and Affiliations

Faculty of Computers & Information, Cairo University, Giza, Egypt
Aboul Ella Hassanien
Dubai International Academic City, The British University, Dubai, United Arab Emirates
Khaled Shaalan
CS Dept., Faculty of Computers and Inform, Suez Canal University, Ismailia, Egypt
Tarek Gaber
Ahmed Orabi Square, Menouf, Egypt
Ahmad Taher Azar
Faculty of Computer & Information Scienc, Ain Shams University, Cairo, Egypt
M. F. Tolba

About this paper

Cite this paper

Amer, E., Foad, K. (2017). AKEA: An Arabic Keyphrase Extraction Algorithm. In: Hassanien, A., Shaalan, K., Gaber, T., Azar, A., Tolba, M. (eds) Proceedings of the International Conference on Advanced Intelligent Systems and Informatics 2016. AISI 2016. Advances in Intelligent Systems and Computing, vol 533. Springer, Cham. https://doi.org/10.1007/978-3-319-48308-5_14

DOI: https://doi.org/10.1007/978-3-319-48308-5_14
Published: 18 October 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-48307-8
Online ISBN: 978-3-319-48308-5
eBook Packages: Engineering, Engineering (R0)
