A Hybrid Approach to the Development of Part-of-Speech Tagger for Kafi-noonoo Text

  • Conference paper
Computational Linguistics and Intelligent Text Processing (CICLing 2014)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 8403)

Abstract

Although natural language processing (NLP) is now a popular area of research and development, less-resourced languages receive little attention from developers. One such under-resourced language is Kafi-noonoo, which is spoken in the south-western regions of Ethiopia. This paper presents the development of a part-of-speech tagger for Kafi-noonoo. To develop the tagger, we employed a hybrid of two systems: a statistical tagger and a rule-based tagger. The lexical and transitional probabilities of word classes are modeled using a hidden Markov model (HMM). However, because the corpus available for the language is limited, a set of transformation rules is applied to improve the result. The system was evaluated on a test corpus and, with 90% of the corpus used for training, the hybrid tagger yielded an accuracy of 80.47%.
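The hybrid idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the toy English corpus, the tag names, the add-one smoothing, and the helper names (`train_hmm`, `viterbi`, `apply_rules`) are all assumptions made for illustration. The sketch estimates transition and lexical (emission) counts from tagged sentences, decodes with the Viterbi algorithm, and then applies transformation rules of the form "change tag A to tag B when the previous tag is C".

```python
import math
from collections import Counter, defaultdict

def train_hmm(tagged_sentences):
    """Count tag-to-tag transitions and tag-to-word (lexical) emissions."""
    trans, emit, vocab = defaultdict(Counter), defaultdict(Counter), set()
    for sentence in tagged_sentences:
        prev = "<s>"  # sentence-start pseudo-tag
        for word, tag in sentence:
            trans[prev][tag] += 1
            emit[tag][word] += 1
            vocab.add(word)
            prev = tag
    return trans, emit, vocab

def log_prob(counter, key, n_outcomes):
    """Add-one smoothed log probability (smoothing choice is an assumption)."""
    return math.log((counter[key] + 1) / (sum(counter.values()) + n_outcomes))

def viterbi(words, trans, emit, vocab):
    """Return the most probable tag sequence for a non-empty word list."""
    tags = sorted(emit)
    n_tags, n_words = len(tags), len(vocab) + 1  # +1 reserves mass for unknown words
    # best[i][t] = (log score of the best path ending in tag t at position i, previous tag)
    best = [{t: (log_prob(trans["<s>"], t, n_tags)
                 + log_prob(emit[t], words[0], n_words), None) for t in tags}]
    for i in range(1, len(words)):
        column = {}
        for t in tags:
            e = log_prob(emit[t], words[i], n_words)
            column[t] = max((best[i - 1][p][0] + log_prob(trans[p], t, n_tags) + e, p)
                            for p in tags)
        best.append(column)
    # Follow the back-pointers from the best final tag.
    tag = max(tags, key=lambda t: best[-1][t][0])
    path = [tag]
    for i in range(len(words) - 1, 0, -1):
        tag = best[i][tag][1]
        path.append(tag)
    return path[::-1]

def apply_rules(tags, rules):
    """Apply rules (from_tag, to_tag, prev_tag) in order, left to right."""
    out = list(tags)
    for from_tag, to_tag, prev_tag in rules:
        for i in range(1, len(out)):
            if out[i] == from_tag and out[i - 1] == prev_tag:
                out[i] = to_tag
    return out

# Hypothetical toy corpus; a real system would train on a tagged Kafi-noonoo corpus.
train = [
    [("the", "DET"), ("dog", "N"), ("runs", "V")],
    [("the", "DET"), ("cat", "N"), ("sleeps", "V")],
    [("a", "DET"), ("dog", "N"), ("sleeps", "V")],
]
trans, emit, vocab = train_hmm(train)
hmm_tags = viterbi(["the", "cat", "runs"], trans, emit, vocab)
# Hypothetical rule: a V directly after a DET is retagged as N.
final_tags = apply_rules(hmm_tags, [("V", "N", "DET")])
print(final_tags)
```

In this sketch the rules act only as a post-correction step on the HMM output, which mirrors the motivation given in the abstract: when the training corpus is small, a few contextual rules can repair systematic errors that the statistical model cannot learn from the data alone.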

Copyright information

© 2014 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Mekuria, Z., Assabie, Y. (2014). A Hybrid Approach to the Development of Part-of-Speech Tagger for Kafi-noonoo Text. In: Gelbukh, A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2014. Lecture Notes in Computer Science, vol 8403. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-54906-9_17

  • DOI: https://doi.org/10.1007/978-3-642-54906-9_17

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-54905-2

  • Online ISBN: 978-3-642-54906-9

  • eBook Packages: Computer Science, Computer Science (R0)
