Large Vocabulary Continuous Speech Recognition for Estonian Using Morphemes and Classes

Alumäe, Tanel

doi:10.1007/978-3-540-30120-2_31

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3206)

Abstract

This paper describes the development of a large-vocabulary, speaker-independent continuous speech recognition system for Estonian. Estonian is an agglutinative language, so the number of distinct word forms is very large; in addition, word order is relatively unconstrained. To achieve good language coverage, we use pseudo-morphemes as the basic units of a statistical trigram language model. To improve language model robustness, we automatically find morpheme classes and interpolate the morpheme model with the class-based model. The language model is trained on a newspaper corpus of 15 million word forms. Clustered triphones with multiple Gaussian mixture components are used for acoustic modeling. The system with the interpolated morpheme language model performs significantly better than the baseline word-form trigram system in all areas. The best system achieves a word error rate of 27.7%, a 10.0% absolute improvement over the baseline system.
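
The interpolation of a morpheme trigram model with a class-based model can be illustrated with a minimal sketch. The abstract does not give the interpolation formula, so linear interpolation with a fixed weight is assumed here, and the class model is assumed to factor as P(class | class history) times P(morpheme | class). All names, class labels (C1, C2, C3), morphemes, and probabilities below are made up for illustration and are not taken from the paper.

```python
from typing import Dict, Tuple

History = Tuple[str, str]  # the two preceding units (morphemes or classes)


def class_model_prob(
    morpheme: str,
    history: History,
    class_of: Dict[str, str],
    class_trigram: Dict[Tuple[History, str], float],
    membership: Dict[Tuple[str, str], float],
) -> float:
    """Class-based estimate: P(c | c1, c2) * P(morpheme | c) (assumed factorization)."""
    class_history = (class_of[history[0]], class_of[history[1]])
    cls = class_of[morpheme]
    p_class = class_trigram.get((class_history, cls), 0.0)
    p_member = membership.get((morpheme, cls), 0.0)
    return p_class * p_member


def interpolated_prob(
    morpheme: str,
    history: History,
    morph_trigram: Dict[Tuple[History, str], float],
    class_of: Dict[str, str],
    class_trigram: Dict[Tuple[History, str], float],
    membership: Dict[Tuple[str, str], float],
    lam: float = 0.5,
) -> float:
    """Linear interpolation: lam * P_morph + (1 - lam) * P_class (assumed form)."""
    p_morph = morph_trigram.get((history, morpheme), 0.0)
    p_class = class_model_prob(morpheme, history, class_of, class_trigram, membership)
    return lam * p_morph + (1.0 - lam) * p_class


# Hypothetical toy tables; real models would be estimated from a corpus.
MORPH_TRIGRAM = {(("suur", "maja"), "s"): 0.2}
CLASS_OF = {"suur": "C1", "maja": "C2", "s": "C3", "st": "C3"}
CLASS_TRIGRAM = {(("C1", "C2"), "C3"): 0.6}
MEMBERSHIP = {("s", "C3"): 0.5, ("st", "C3"): 0.5}

# Trigram seen in the morpheme model: both components contribute.
p_seen = interpolated_prob("s", ("suur", "maja"), MORPH_TRIGRAM,
                           CLASS_OF, CLASS_TRIGRAM, MEMBERSHIP)
# Trigram unseen in the morpheme model: the class model still gives it mass.
p_unseen = interpolated_prob("st", ("suur", "maja"), MORPH_TRIGRAM,
                             CLASS_OF, CLASS_TRIGRAM, MEMBERSHIP)
print(p_seen, p_unseen)
```

In this toy setup the unseen morpheme sequence receives a nonzero probability purely through its class, which is the kind of robustness gain the class-based component is meant to provide. In practice the interpolation weight would be tuned on held-out data rather than fixed at 0.5.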

Author information

Authors and Affiliations

Laboratory of Phonetics and Speech Technology, Institute of Cybernetics, Tallinn Technical University, Estonia
Tanel Alumäe

Editor information

Editors and Affiliations

Faculty of Informatics, Masaryk University, Brno, Czech Republic
Petr Sojka
Faculty of Informatics, Masaryk University, Botanická 68a, CZ-602 00, Brno, Czech Republic
Ivan Kopeček
Faculty of Informatics, Department of Computer Graphics and Design, Masaryk University, Botanická 68a, 602 00, Brno, Czech Republic
Karel Pala

About this paper

Cite this paper

Alumäe, T. (2004). Large Vocabulary Continuous Speech Recognition for Estonian Using Morphemes and Classes. In: Sojka, P., Kopeček, I., Pala, K. (eds) Text, Speech and Dialogue. TSD 2004. Lecture Notes in Computer Science, vol 3206. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30120-2_31

DOI: https://doi.org/10.1007/978-3-540-30120-2_31
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-23049-6
Online ISBN: 978-3-540-30120-2
eBook Packages: Springer Book Archive