Abstract
We present a method, based on the Property Grammar formalism, for enriching the Arabic treebank ATB with syntactic constraints (so-called properties). Property Grammar is a fully constraint-based formalism that states constraints directly over categories, which facilitates the enrichment process. The enrichment proceeds in three phases: formalizing the problem, inducing a Property Grammar from the ATB, and regenerating the treebank with a new property-based syntactic representation. Enriching the ATB makes it more useful for many NLP applications, such as ambiguity resolution; it also enables the acquisition of new linguistic resources and eases probabilistic parsing. The process is fully automatic and independent of any particular language or source-corpus formalism, which motivates its reuse. Our experiments gave encouraging results and yielded a variety of properties of different types.
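The induction phase can be illustrated with a minimal sketch. It is not the authors' implementation: the toy trees, the transliterated words, the function names `local_rules` and `induce`, and the choice of three property types (constituency, uniqueness, linearity, as commonly defined in the Property Grammar literature) are all assumptions made for illustration. The abstract does not say which property types are induced or how.

```python
from collections import defaultdict

# Hypothetical toy trees in the form (label, child, child, ...).
# A leaf is a (POS, word) pair; the words are illustrative only.
TREES = [
    ("S", ("VP", ("VBD", "kataba"),
                 ("NP", ("DET", "Al"), ("NN", "walad")))),
    ("S", ("VP", ("VBD", "qaraOa"),
                 ("NP", ("NN", "kitAb"), ("ADJ", "jadiyd")))),
]


def local_rules(tree):
    """Yield (parent_label, [child_labels]) for every non-leaf node."""
    label, *children = tree
    if len(children) == 1 and isinstance(children[0], str):
        return  # POS leaf: no local rule to report
    yield label, [child[0] for child in children]
    for child in children:
        yield from local_rules(child)


def induce(trees):
    """Collect simple properties per category from observed local rules."""
    rhs_by_cat = defaultdict(list)
    for tree in trees:
        for parent, kids in local_rules(tree):
            rhs_by_cat[parent].append(kids)

    grammar = {}
    for cat, rhss in rhs_by_cat.items():
        # Constituency: every category observed as a daughter of `cat`.
        constituency = sorted({k for rhs in rhss for k in rhs})
        # Uniqueness: daughters that never occur twice in one expansion.
        uniqueness = sorted(
            k for k in constituency if all(rhs.count(k) <= 1 for rhs in rhss)
        )
        # Linearity: a precedes b in some expansion and never follows it.
        seen = set()
        for rhs in rhss:
            for i, a in enumerate(rhs):
                for b in rhs[i + 1:]:
                    if a != b:
                        seen.add((a, b))
        linearity = sorted(p for p in seen if (p[1], p[0]) not in seen)
        grammar[cat] = {
            "constituency": constituency,
            "uniqueness": uniqueness,
            "linearity": linearity,
        }
    return grammar


GRAMMAR = induce(TREES)
for category, props in GRAMMAR.items():
    print(category, props)
```

In a regeneration step of the kind the abstract describes, each node of a treebank tree could then be annotated with the properties of its category that its daughters satisfy; that step is left out here because the source gives no detail on the target representation.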
Notes
1. Arabic Transliteration Table on Tim Buckwalter's site: www.qamus.org/transliteration.htm.
Copyright information
© 2016 Springer International Publishing AG
About this paper
Cite this paper
Bahloul, R.B., Haddar, K., Blache, P. (2016). A Property Grammar-Based Method to Enrich the Arabic Treebank ATB. In: Fred, A., Dietz, J., Aveiro, D., Liu, K., Filipe, J. (eds) Knowledge Discovery, Knowledge Engineering and Knowledge Management. IC3K 2015. Communications in Computer and Information Science, vol 631. Springer, Cham. https://doi.org/10.1007/978-3-319-52758-1_17
DOI: https://doi.org/10.1007/978-3-319-52758-1_17
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-52757-4
Online ISBN: 978-3-319-52758-1
eBook Packages: Computer Science, Computer Science (R0)