Automatic Morpheme Slot Identification Using Genetic Algorithm

  • Wondwossen Mulugeta
  • Michael Gasser
  • Baye Yimam
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9561)


We introduce an approach to grouping morphemes into suffix slots in morphologically complex languages using a genetic algorithm. The method is applied to verbs in Amharic, an under-resourced, morphologically rich Semitic language with a number of non-concatenative prefix and suffix morphemes. We start with a limited set of segmented verbs and the set of suffixes themselves, both extracted on the basis of our previous work. Each member of the genetic algorithm's population is an assignment of each morpheme to one of the possible slots. The fitness function combines a score for exact slot position with a score for correct ordering of morphemes. We use a mutation operator but no crossover, and we experiment with various combinations of population size, mutation rate, and number of generations. The evolved models yield promising morpheme classification results, with an accuracy of 90.02%. We evaluate the fittest individuals against the known morpheme classes for Amharic.
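The search described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the scoring formulas, so the position score (the i-th suffix of a word sits in slot i), the ordering score (adjacent suffixes occupy strictly increasing slots), the equal weights, the keep-the-top-half selection scheme, and the placeholder suffix labels `x`, `y`, `z` are all assumptions made for illustration.

```python
import random

def fitness(assignment, words, w_pos=1.0, w_ord=1.0):
    """Score one individual: a dict mapping each suffix to a slot index.

    Both component scores are assumed definitions, not the paper's.
    """
    pos_score = 0
    ord_score = 0
    for suffixes in words:
        slots = [assignment[s] for s in suffixes]
        # Position score: the i-th suffix of the word is assigned slot i.
        pos_score += sum(1 for i, slot in enumerate(slots) if slot == i)
        # Ordering score: adjacent suffixes occupy strictly increasing slots.
        ord_score += sum(1 for a, b in zip(slots, slots[1:]) if a < b)
    return w_pos * pos_score + w_ord * ord_score

def mutate(assignment, n_slots, rate, rng):
    """Copy an individual and reassign each suffix with probability `rate`."""
    child = dict(assignment)
    for morpheme in child:
        if rng.random() < rate:
            child[morpheme] = rng.randrange(n_slots)
    return child

def evolve(words, n_slots, pop_size=30, rate=0.2, generations=200, seed=0):
    """Mutation-only evolution (no crossover), keeping the fitter half."""
    rng = random.Random(seed)
    morphemes = sorted({s for word in words for s in word})
    population = [
        {m: rng.randrange(n_slots) for m in morphemes} for _ in range(pop_size)
    ]
    for _ in range(generations):
        population.sort(key=lambda ind: fitness(ind, words), reverse=True)
        survivors = population[: pop_size // 2]
        children = [
            mutate(rng.choice(survivors), n_slots, rate, rng)
            for _ in range(pop_size - len(survivors))
        ]
        population = survivors + children
    return max(population, key=lambda ind: fitness(ind, words))

# Toy input: each word is its suffix sequence, in order after the stem.
# The labels are placeholders, not Amharic morphemes.
words = [["x", "y", "z"], ["x", "y"], ["x", "z"], ["y", "z"]]
best = evolve(words, n_slots=3)
print(best, fitness(best, words))
```

Because the fitter half of each generation survives unchanged, the best fitness never decreases from one generation to the next; varying `pop_size`, `rate`, and `generations` corresponds to the parameter combinations mentioned in the abstract.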


Keywords: Amharic, Morpheme slots, Genetic algorithm, Morphological analysis, Machine learning



Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  • Wondwossen Mulugeta (1)
  • Michael Gasser (2)
  • Baye Yimam (1)
  1. Addis Ababa University, Addis Ababa, Ethiopia
  2. Indiana University, Bloomington, USA
