Abstract
The salient issues facing contemporary Arabic morphological analysis are summarized as predominantly orthographic in nature, although the issue of how to integrate morphological analysis of the dialects into the existing morphological analysis of Modern Standard Arabic is identified as the primary challenge of the next decade. Issues of orthography that impact morphological analysis stem in part from the successful deployment of the Unicode standard and the subsequent increase in usage of the expanded Arabic character set, including what are properly Persian and Urdu characters. Additional orthographic issues impacting morphological analysis arise from the persistent and widespread variation in the spelling of letters such as hamza and tā’ marbūTa, and the increasing lack of differentiation between word-final yā’ and alif maqSūra. The tokenization of Arabic input strings is also affected by orthography, as typists often neglect to insert a space after words that end with a non-connector letter. An increasing number of archaic morphological features and dated lexical items can be observed in Web-based Islamic publications and cannot be overlooked in contemporary analysis. Finally, the accuracy and completeness of current Arabic morphological analysis can be questioned in light of the almost complete absence of annotation for lexically determined features of gender, number, and humanness.
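The orthographic variation described above can be made concrete with a small sketch. The function below is not taken from the chapter or from any existing analyzer; it is a hypothetical lookup-key normalizer, written in Python, that assumes one common set of folding choices: Persian/Urdu look-alike code points are mapped to their Arabic counterparts, alif with hamza or madda is reduced to bare alif, and word-final tā’ marbūTa and alif maqSūra are folded to hā’ and yā’. The name `normalize_token` and the particular folding table are assumptions made for illustration.

```python
import re

# Code points involved in the spelling variation discussed in the abstract.
ALIF = "\u0627"                              # bare alif
ALIF_HAMZA_VARIANTS = "\u0622\u0623\u0625"   # alif madda, alif hamza above, alif hamza below
TA_MARBUTA = "\u0629"
HA = "\u0647"
ALIF_MAQSURA = "\u0649"
YA = "\u064A"
FARSI_YEH = "\u06CC"                         # Persian/Urdu yeh, often typed in place of Arabic ya
KEHEH = "\u06A9"                             # Persian/Urdu kaf, often typed in place of Arabic kaf
KAF = "\u0643"


def normalize_token(token: str) -> str:
    """Return a folded lookup key for one token (hypothetical scheme, an assumption)."""
    # Map Persian/Urdu look-alikes to their Arabic counterparts.
    token = token.replace(FARSI_YEH, YA).replace(KEHEH, KAF)
    # Reduce hamza-bearing and madda-bearing alif to bare alif.
    token = re.sub(f"[{ALIF_HAMZA_VARIANTS}]", ALIF, token)
    # Fold word-final ta marbuta to ha.
    if token.endswith(TA_MARBUTA):
        token = token[:-1] + HA
    # Fold word-final alif maqsura to ya.
    if token.endswith(ALIF_MAQSURA):
        token = token[:-1] + YA
    return token


if __name__ == "__main__":
    # Two spellings of the same word that differ only in the final letter
    # produce the same key under this scheme.
    with_ta_marbuta = "\u0645\u062F\u0631\u0633\u0629"
    with_ha = "\u0645\u062F\u0631\u0633\u0647"
    print(normalize_token(with_ta_marbuta) == normalize_token(with_ha))
```

Folding of this kind trades precision for recall: it lets variant spellings reach the same lexicon entry, but it also merges forms that are genuinely distinct, so an analyzer using such a key would still need to rank or filter the candidate analyses it retrieves.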
© 2007 Springer
About this chapter
Cite this chapter
Buckwalter, T. (2007). Issues in Arabic Morphological Analysis. In: Soudi, A., Bosch, A.v., Neumann, G. (eds) Arabic Computational Morphology. Text, Speech and Language Technology, vol 38. Springer, Dordrecht. https://doi.org/10.1007/978-1-4020-6046-5_3
DOI: https://doi.org/10.1007/978-1-4020-6046-5_3
Publisher Name: Springer, Dordrecht
Print ISBN: 978-1-4020-6045-8
Online ISBN: 978-1-4020-6046-5