Abstract
Language model and acoustic model adaptation play an important role in improving the performance and robustness of automatic speech recognition, especially when developing domain-specific, gender-dependent, or user-adapted systems. This paper focuses on language model speaker adaptation for transcribing Slovak parliament proceedings for individual speakers. Building on current research, we have developed a framework that combines multiple speech recognition outputs with acoustic and language model adaptation at different stages. Preliminary results show significant relative reductions of 45 % to 74 % in model perplexity and 29 % to 43 % in speech recognition word error rate, for male and female speakers respectively.
Acknowledgments
The research presented in this paper was supported by the Ministry of Education, Science, Research and Sport of the Slovak Republic under the project VEGA 1/0075/15 (50 %) and the Research and Development Operational Programme funded by the ERDF under the project ITMS: 26220220182 (50 %).
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Staš, J., Hládek, D., Juhár, J. (2015). Language Model Speaker Adaptation for Transcription of Slovak Parliament Proceedings. In: Ronzhin, A., Potapova, R., Fakotakis, N. (eds) Speech and Computer. SPECOM 2015. Lecture Notes in Computer Science, vol 9319. Springer, Cham. https://doi.org/10.1007/978-3-319-23132-7_32
DOI: https://doi.org/10.1007/978-3-319-23132-7_32
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-23131-0
Online ISBN: 978-3-319-23132-7
eBook Packages: Computer Science (R0)