Abstract
Keyphrase extraction is a critical step in many natural language processing and information retrieval applications. In this paper, we introduce AKEA, a keyphrase extraction algorithm for single Arabic documents. AKEA is unsupervised: it requires no training to perform its task. It relies on heuristics that combine linguistic patterns based on Part-Of-Speech (POS) tags, statistical knowledge, and the internal structural pattern of terms (i.e., word occurrence). We use Arabic Wikipedia to improve the ranking (or significance) of candidate keyphrases by adding a confidence score if the candidate exists as an indexed Wikipedia concept. Experimental results show that, on average, AKEA achieves the highest precision and the highest F-measure, which indicates that it produces more accurate results than the other algorithms it was compared with.
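The abstract names three ingredients: POS-based candidate patterns, a statistical score, and a confidence bonus for candidates that exist as Wikipedia concepts. The sketch below illustrates how such pieces could fit together; it is not the authors' implementation. The tag names (`NOUN`, `ADJ`), the noun-then-adjectives pattern, the use of raw frequency as the statistical score, the `wiki_bonus` parameter, and the in-memory `wiki_titles` set are all assumptions made for illustration.

```python
from collections import Counter


def extract_candidates(tagged_tokens, max_len=3):
    """Collect contiguous runs that start with a NOUN and continue with NOUN/ADJ.

    The pattern is a hypothetical stand-in for the paper's linguistic patterns.
    """
    candidates = []
    run = []
    # A sentinel token flushes the final run.
    for word, tag in list(tagged_tokens) + [("", "END")]:
        if tag == "NOUN" or (tag == "ADJ" and run):
            run.append(word)
        else:
            if run and len(run) <= max_len:
                candidates.append(" ".join(run))
            run = []
    return candidates


def rank_candidates(tagged_tokens, wiki_titles, wiki_bonus=1.0):
    """Score each candidate by frequency, plus a bonus if it is a known Wikipedia title."""
    counts = Counter(extract_candidates(tagged_tokens))
    scores = {}
    for phrase, freq in counts.items():
        score = float(freq)
        if phrase in wiki_titles:  # assumed: titles preloaded into a set
            score += wiki_bonus
        scores[phrase] = score
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


if __name__ == "__main__":
    tagged = [
        ("اللغة", "NOUN"), ("العربية", "ADJ"), ("في", "PREP"),
        ("اللغة", "NOUN"), ("العربية", "ADJ"), ("و", "CONJ"),
        ("الحاسوب", "NOUN"),
    ]
    for phrase, score in rank_candidates(tagged, {"الحاسوب"}, wiki_bonus=1.5):
        print(phrase, score)
```

In this toy input the single-occurrence candidate that matches a Wikipedia title overtakes a more frequent candidate that does not, which is the kind of re-ranking effect the abstract attributes to the confidence score. A real system would need an Arabic POS tagger and a lookup against an Arabic Wikipedia title index, neither of which is shown here.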
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Amer, E., Foad, K. (2017). AKEA: An Arabic Keyphrase Extraction Algorithm. In: Hassanien, A., Shaalan, K., Gaber, T., Azar, A., Tolba, M. (eds) Proceedings of the International Conference on Advanced Intelligent Systems and Informatics 2016. AISI 2016. Advances in Intelligent Systems and Computing, vol 533. Springer, Cham. https://doi.org/10.1007/978-3-319-48308-5_14
DOI: https://doi.org/10.1007/978-3-319-48308-5_14
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-48307-8
Online ISBN: 978-3-319-48308-5
eBook Packages: Engineering, Engineering (R0)