Hierarchical Amharic Base Phrase Chunking Using HMM with Error Pruning

  • Abeba Ibrahim
  • Yaregal Assabie
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9561)


Segmentation of a text into non-overlapping syntactic units (chunks) has become an essential component of many natural language processing applications. This paper presents an Amharic base phrase chunker that groups syntactically correlated words at different levels using a hidden Markov model (HMM). Rules are then applied to correct phrases that the HMM chunks incorrectly. Phrase boundaries are identified using the IOB2 chunk specification. To test the performance of the system, a corpus was collected from Amharic news outlets and books. The training and testing datasets were prepared using 10-fold cross-validation. Test results on the corpus showed an average accuracy of 85.31% before applying the error-correction rules and 93.75% after applying them.
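As an illustration of the IOB2 chunk specification mentioned in the abstract, the following minimal Python sketch recovers chunk spans from a sequence of IOB2 tags. In IOB2, the first word of every chunk is tagged B-X, subsequent words of the same chunk are tagged I-X, and words outside any chunk are tagged O. The function name `iob2_to_chunks` and the chunk labels NP and VP are assumptions chosen for illustration; the abstract does not list the paper's actual tag set, and this is not the authors' implementation.

```python
from typing import List, Optional, Tuple


def iob2_to_chunks(tags: List[str]) -> List[Tuple[str, int, int]]:
    """Return (chunk_type, start, end_exclusive) spans from IOB2 tags.

    Hypothetical helper for illustration only.
    """
    chunks: List[Tuple[str, int, int]] = []
    start: Optional[int] = None
    ctype: Optional[str] = None

    for i, tag in enumerate(tags):
        is_begin = tag.startswith("B-")
        is_inside = tag.startswith("I-")

        # The open chunk ends at O, at any B- tag, or at an I- tag
        # whose type differs from the open chunk's type.
        if tag == "O" or is_begin or (is_inside and tag[2:] != ctype):
            if ctype is not None:
                chunks.append((ctype, start, i))
                start, ctype = None, None

        # A new chunk opens at B-. A stray I- with no open chunk is
        # treated leniently as a chunk start (strict IOB2 forbids it).
        if is_begin or (is_inside and ctype is None):
            start, ctype = i, tag[2:]

    # Close a chunk that runs to the end of the sentence.
    if ctype is not None:
        chunks.append((ctype, start, len(tags)))
    return chunks


# Placeholder tag sequence; the labels are assumed, not taken from the paper.
example_tags = ["B-NP", "I-NP", "B-VP", "O", "B-NP"]
example_chunks = iob2_to_chunks(example_tags)
print(example_chunks)
```

Each returned tuple gives the chunk label and the half-open word index range it covers, so a rule-based correction step of the kind the abstract describes could, under these assumptions, operate on spans rather than on individual tags.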


Amharic language processing; Base phrase chunking; Partial parsing



Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  1. Department of Computer Science, Addis Ababa University, Addis Ababa, Ethiopia
