A statistical tagger for morphological tagging of Russian language texts

  • Topical Issue
  • Automation and Remote Control

Abstract

We consider a method for constructing a statistical tagger for automated morphological tagging of Russian-language texts. In this method, each word is assigned a tag that encodes its part of speech and the full set of its morphological characteristics. We use the set of morphological characteristics defined in the SynTagRus corpus, whose material was used to train the tagger. The tagger is based on the SVM (Support Vector Machine) approach. The resulting tagger proved to be efficient and showed high tagging quality.
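The abstract does not give the feature set, the tag encoding, or the training procedure, so the following is only a minimal sketch of the general idea: each composite tag (part of speech plus morphological features) is treated as one class label, and a linear classifier scores it from context-window features. The tag strings, the `features` function, and the toy sentences are all hypothetical and are not the SynTagRus tagset. Training uses plain multiclass hinge-loss updates without a regularization term, a simplified stand-in for an SVM solver, not the authors' implementation.

```python
from collections import defaultdict

# Toy training data. The composite tags are illustrative stand-ins (assumption),
# not the real SynTagRus tagset: part of speech and morphology in one label.
TRAIN = [
    [("мама", "S.sg.f.nom"), ("мыла", "V.past.sg.f"), ("раму", "S.sg.f.acc")],
    [("папа", "S.sg.m.nom"), ("читал", "V.past.sg.m"), ("книгу", "S.sg.f.acc")],
    [("сестра", "S.sg.f.nom"), ("читала", "V.past.sg.f"), ("газету", "S.sg.f.acc")],
    [("брат", "S.sg.m.nom"), ("мыл", "V.past.sg.m"), ("машину", "S.sg.f.acc")],
]


def split_tag(tag):
    """Split a composite tag into (part of speech, list of morphological features)."""
    pos, *morph = tag.split(".")
    return pos, morph


def features(words, i):
    """Context-window features for the word at position i (hypothetical feature set)."""
    w = words[i].lower()
    prev_w = words[i - 1].lower() if i > 0 else "<s>"
    next_w = words[i + 1].lower() if i + 1 < len(words) else "</s>"
    return ["bias", "word=" + w, "suffix2=" + w[-2:], "suffix3=" + w[-3:],
            "prev=" + prev_w, "next=" + next_w]


def score(weights, tag, feats):
    """Linear score of one candidate tag: sum of its weights over active features."""
    return sum(weights[tag].get(f, 0.0) for f in feats)


def predict(weights, feats):
    """Return the highest-scoring composite tag."""
    return max(sorted(weights), key=lambda tag: score(weights, tag, feats))


def train(sentences, max_epochs=200, step=1.0):
    """Multiclass hinge-loss training: update whenever the gold tag does not
    beat the best wrong tag by a margin of at least 1 (no regularization)."""
    tags = sorted({tag for sent in sentences for _, tag in sent})
    weights = {tag: defaultdict(float) for tag in tags}
    for _ in range(max_epochs):
        updates = 0
        for sent in sentences:
            words = [w for w, _ in sent]
            for i, (_, gold) in enumerate(sent):
                feats = features(words, i)
                rival = max((t for t in tags if t != gold),
                            key=lambda t: score(weights, t, feats))
                margin = score(weights, gold, feats) - score(weights, rival, feats)
                if margin < 1.0:
                    for f in feats:
                        weights[gold][f] += step
                        weights[rival][f] -= step
                    updates += 1
        if updates == 0:  # every training token is separated with margin >= 1
            break
    return weights


def tag_sentence(weights, words):
    """Tag each word independently from its context-window features."""
    return [predict(weights, features(words, i)) for i in range(len(words))]


weights = train(TRAIN)
print(tag_sentence(weights, ["сестра", "мыла", "машину"]))
print(split_tag("S.sg.f.acc"))
```

Treating the whole composite tag as a single class keeps the classifier simple but makes the label set large for a full morphological tagset; a real system would typically delegate training to a library such as LIBLINEAR rather than the hand-written update loop used here.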

Additional information

Original Russian Text © V.V. Petrochenkov, A.O. Kazennikov, 2013, published in Avtomatika i Telemekhanika, 2013, No. 10, pp. 154–165.

Cite this article

Petrochenkov, V.V., Kazennikov, A.O. A statistical tagger for morphological tagging of Russian language texts. Autom Remote Control 74, 1724–1732 (2013). https://doi.org/10.1134/S0005117913100123
