Abstract
In this paper, we present a machine learning approach to Arabic pronominal anaphora resolution. The approach resolves anaphoric pronouns without linguistic or domain knowledge and without deep parsing. It relies on features that are widely used in the literature for other languages, such as English. In addition, we propose new features specific to the Arabic language. We provide a practical implementation of this approach, which has been evaluated on three data sets (a technical manual, newspaper articles and educational texts). The evaluation results show that our approach performs well on Arabic pronominal anaphora resolution: the F-measure is 86.2% for the technical manual, 84.5% for the newspaper articles and 72.1% for the literary texts.
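The abstract does not state how the reported F-measure is computed. As an illustration only, the following sketch computes the standard F-measure from precision and recall over pronoun-antecedent links. The function name `f_measure`, the dictionary representation of links and the toy identifiers are assumptions made for this example and are not taken from the paper.

```python
def f_measure(gold, predicted, beta=1.0):
    """Return the F-measure of predicted pronoun-antecedent links.

    gold and predicted are dicts mapping a pronoun id to an antecedent id.
    This representation is a hypothetical choice for illustration.
    """
    # A predicted link is correct when the gold annotation assigns
    # the same antecedent to the same pronoun.
    correct = sum(1 for p, a in predicted.items() if gold.get(p) == a)
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    if precision + recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Toy data: four annotated pronouns, three resolved, two resolved correctly.
gold_links = {"p1": "a1", "p2": "a2", "p3": "a3", "p4": "a4"}
predicted_links = {"p1": "a1", "p2": "a2", "p3": "a9"}
score = f_measure(gold_links, predicted_links)
print(f"F-measure: {score:.3f}")
```

On this toy input the precision is 2/3 and the recall is 2/4, which gives an F-measure of 4/7. With `beta=1.0` precision and recall are weighted equally; whether the paper uses this weighting is not stated in the abstract.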
Notes
1. Weka: http://weka.wikispaces.com/.
2. The parameters for these classifiers are those of Weka’s default settings.
Copyright information
© 2018 Springer International Publishing AG, part of Springer Nature
About this paper
Cite this paper
Hammami, S.M., Belguith, L.H. (2018). Arabic Pronominal Anaphora Resolution Based on New Set of Features. In: Gelbukh, A. (ed.) Computational Linguistics and Intelligent Text Processing. CICLing 2016. Lecture Notes in Computer Science, vol 9623. Springer, Cham. https://doi.org/10.1007/978-3-319-75477-2_38
DOI: https://doi.org/10.1007/978-3-319-75477-2_38
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-75476-5
Online ISBN: 978-3-319-75477-2
eBook Packages: Computer Science, Computer Science (R0)