Challenges in Speech Processing of Slavic Languages (Case Studies in Speech Recognition of Czech and Slovak)

  • Chapter
Development of Multimodal Interfaces: Active Listening and Synchrony

Part of the book series: Lecture Notes in Computer Science ((LNISA,volume 5967))

Abstract

Slavic languages pose a major challenge for researchers working on speech technology. They are highly inflected, with declension of nouns, pronouns and adjectives, and conjugation of verbs. This greatly enlarges the lexical inventories of these languages and significantly complicates the design of text-to-speech and, in particular, speech-to-text systems. In this paper, we demonstrate some typical features of Slavic languages and show how they can be handled in the development of practical speech processing systems. We present the solutions we applied in the design of voice dictation and broadcast speech transcription systems developed for Czech. Furthermore, we demonstrate how these systems can be converted to another, similar Slavic language, in our case Slovak. All the presented systems operate in real time with very large vocabularies (350K words for Czech, 170K words for Slovak), and some of them have already been deployed in practice.

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Nouza, J., Zdansky, J., Cerva, P., Silovsky, J. (2010). Challenges in Speech Processing of Slavic Languages (Case Studies in Speech Recognition of Czech and Slovak). In: Esposito, A., Campbell, N., Vogel, C., Hussain, A., Nijholt, A. (eds) Development of Multimodal Interfaces: Active Listening and Synchrony. Lecture Notes in Computer Science, vol 5967. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-12397-9_19

  • DOI: https://doi.org/10.1007/978-3-642-12397-9_19

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-12396-2

  • Online ISBN: 978-3-642-12397-9

  • eBook Packages: Computer Science, Computer Science (R0)
