Automatic Recognition of Czech Derivational Prefixes

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 3406)

Abstract

This paper describes the application of a method for the automatic, unsupervised recognition of derivational prefixes of Czech words. The technique combines two statistical measures: Entropy and the Economy Principle. The data were taken from the list of almost 170 000 lemmas of the Czech National Corpus.
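To make the entropy half of the idea concrete, here is a minimal sketch, not the authors' implementation. It assumes that "Entropy" refers to the Shannon entropy of the letter that follows a candidate prefix across a lemma list, which is one common way to apply the measure to segmentation. The function name `successor_entropy` and the toy lemma list are hypothetical, and the Economy Principle is not attempted because the abstract does not define it.

```python
import math
from collections import Counter


def successor_entropy(prefix, lemmas):
    """Shannon entropy (in bits) of the character that follows `prefix`.

    Assumption: a high value means many different continuations, which is
    taken here as a hint that `prefix` ends at a morpheme boundary.
    """
    # Count the next character for every lemma that properly extends the prefix.
    counts = Counter(
        word[len(prefix)]
        for word in lemmas
        if word.startswith(prefix) and len(word) > len(prefix)
    )
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Toy lemma list for illustration only; not taken from the Czech National Corpus.
lemmas = ["nadepsat", "napsat", "nalít", "nabrat", "nový", "noc"]

for candidate in ("na", "no", "nad"):
    print(candidate, successor_entropy(candidate, lemmas))
```

On this toy list, `na` is followed by four different letters with equal frequency, so its entropy is 2 bits, while `nad` has a single continuation and scores zero. In a real setting the ranking would be computed over the full lemma list and combined with the second measure.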

Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Urrea, A.M., Hlaváčová, J. (2005). Automatic Recognition of Czech Derivational Prefixes. In: Gelbukh, A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2005. Lecture Notes in Computer Science, vol 3406. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30586-6_18

  • DOI: https://doi.org/10.1007/978-3-540-30586-6_18

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-24523-0

  • Online ISBN: 978-3-540-30586-6

  • eBook Packages: Computer Science, Computer Science (R0)
