Abstract
This paper introduces a new two-level error tagset, AALETA (Alfaifi Atwell Leeds Error Tagset for Arabic), for annotating Arabic Learner Corpora (ALC). The tagset comprises six broad classes, subdivided into 37 more specific error types or subcategories, and is designed to be easily understood by annotators of errors in Arabic corpora. AALETA is based on an existing error tagset for Arabic corpora, ARIDA, created by Abuhakema et al. [1], and on a number of other error-analysis studies. It was used to annotate texts of the Arabic Learner Corpus [2]. The paper presents the tagset's broad classes and their types or subcategories, together with an example of annotation. The understandability of AALETA was measured against that of ARIDA; preliminary results showed that AALETA achieved a slightly higher score, and annotators reported that they preferred using AALETA over ARIDA.
References
Abuhakema, G., Feldman, A., Fitzpatrick, E.: ARIDA: An Arabic Interlanguage Database and Its Applications: A Pilot Study. Journal of the National Council of Less Commonly Taught Languages (JNCOLCTL) 7, 161–184 (2009)
Alfaifi, A., Atwell, E.: المدونات اللغوية لمتعلمي اللغة العربية: نظامٌ لتصنيف وترميز الأخطاء اللغوية (in Arabic) "Arabic Learner Corpora (ALC): A Taxonomy of Coding Errors". In: 8th International Computing Conference in Arabic (ICCA 2012), Cairo, Egypt, December 26–28 (2012)
Granger, S.: The International Corpus of Learner English: A New Resource for Foreign Language Learning and Teaching and Second Language Acquisition Research. TESOL Quarterly 37(3), 538–546 (2003)
Nesselhauf, N.: Learner Corpora and Their Potential in Language Teaching. In: Sinclair, J. (ed.) How to Use Corpora in Language Teaching, pp. 125–152. Benjamins, Amsterdam (2004)
Buttery, P., Caines, A.: Normalising Frequency Counts to Account for ‘opportunity of use’ in Learner Corpora. In: Tono, Y., Kawaguchi, Y., Minegishi, M. (eds.) Developmental and Crosslinguistic Perspectives in Learner Corpus Research, pp. 187–204. John Benjamins, Amsterdam (2012)
Meunier, F., et al.: The LONGDALE (Longitudinal Database of Learner English), [cited 2012, September 14] (2010), http://www.uclouvain.be/en-cecl-longdale.html
Diez-Bedmar, M.B.: Written Learner Corpora by Spanish Students of English: An Overview. In: Gómez, P.C., Pérez, A.S. (eds.) A Survey on Corpus-based Research, Proceedings of the AELINCO Conference, pp. 920–933. Asociación Española de Lingüística del Corpus, Murcia (2009)
Hammarberg, B.: Introduction to the ASU Corpus, a Longitudinal Oral and Written Text Corpus of Adult Learners’ Swedish with a Corresponding Part from Native Swedes. Stockholm University, Department of Linguistics (2010)
Dagneaux, E., et al.: Error tagging manual (1996)
Granger, S.: Error-tagged Learner Corpora and CALL: A Promising Synergy. CALICO Journal 20(3), 465–480 (2003)
Nicholls, D.: The Cambridge Learner Corpus - error coding and analysis for lexicography and ELT. In: Corpus Linguistics 2003 Conference (CL 2003), Lancaster, UK (2003)
Izumi, E., Uchimoto, K., Isahara, H.: Error annotation for corpus of Japanese learner English. In: Sixth International Workshop on Linguistically Interpreted Corpora (LINC 2005), Jeju Island, Korea, October 15 (2005)
Alosaili, A.I.: الأخطاء الشائعة في الكلام لدى طلاب اللغة العربية الناطقين بلغات أخرى: دراسة وصفية تحليلية (in Arabic) "Common Errors in Speech Production of Non-Native Arabic Learners". Al Imam Mohammad Ibn Saud Islamic University, Riyadh, Saudi Arabia (1985)
Alateeq, Z.M.: تحليل الأخطاء الدلالية لدى دارسي اللغة العربية من غير الناطقين بها في مادة التعبير الكتابي (in Arabic) "Semantic Errors Analysis of Non-Native Arabic Learners in Writing". Al Imam Mohammad Ibn Saud Islamic University, Riyadh, Saudi Arabia (1992)
Alhamad, M.M.: تحليل أخطاء التعبير الكتابي لدى المستوى المتقدم من دارسي العربية غير الناطقين بها في جامعة الملك سعود (in Arabic) "Writing Errors Analysis of Advanced-Level Arabic Learners at King Saud University". Al Imam Mohammad Ibn Saud Islamic University, Riyadh, Saudi Arabia (1994)
Alaqeeli, A.S.: تحليل الأخطاء في بعض أنماط الجملة الفعلية للغة العربية في الأداء الكتابي لدى دارسي المستوى المتقدم (in Arabic) "Error Analysis in Some Verbal Sentence Patterns of Arabic in Writing Production of Advanced-Level Learners". Al Imam Mohammad Ibn Saud Islamic University, Riyadh, Saudi Arabia (1995)
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Alfaifi, A., Atwell, E., Abuhakema, G. (2013). Error Annotation of the Arabic Learner Corpus. In: Gurevych, I., Biemann, C., Zesch, T. (eds) Language Processing and Knowledge in the Web. Lecture Notes in Computer Science, vol 8105. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-40722-2_2
DOI: https://doi.org/10.1007/978-3-642-40722-2_2
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-40721-5
Online ISBN: 978-3-642-40722-2
eBook Packages: Computer Science, Computer Science (R0)