Language Model Speaker Adaptation for Transcription of Slovak Parliament Proceedings

  • Conference paper
Speech and Computer (SPECOM 2015)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9319)

Abstract

Language model and acoustic model adaptation play an important role in improving the performance and robustness of automatic speech recognition, especially when developing domain-specific, gender-dependent, or user-adapted systems. This paper focuses on language model speaker adaptation for transcribing Slovak parliament proceedings for individual speakers. Building on current research, we have developed a framework that combines multiple speech recognition outputs with acoustic and language model adaptation at different stages. Preliminary results show a significant relative decrease in model perplexity, from 45 % to 74 %, and in speech recognition word error rate, from 29 % to 43 %, for male and female speakers respectively.
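
The two evaluation quantities named in the abstract, word error rate and a relative decrease against a baseline, can be illustrated with a minimal sketch. This is not the authors' evaluation pipeline: the function names `word_error_rate` and `relative_reduction` are hypothetical, the WER is computed as a plain word-level edit distance divided by the reference length, and the example values are toy inputs rather than results from the paper.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing in hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(baseline: float, adapted: float) -> float:
    """Relative decrease of a metric (perplexity or WER) against a baseline."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (baseline - adapted) / baseline


if __name__ == "__main__":
    # Toy inputs only (hypothetical, not taken from the paper).
    print(word_error_rate("a b c d", "a x c"))
    print(relative_reduction(20.0, 15.0))
```

Expressing the change relative to the baseline, rather than as an absolute difference, makes improvements comparable across speakers whose baseline perplexity or error rate differs.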

Notes

  1. www.nrsr.sk/dl/

Acknowledgments

The research presented in this paper was supported by the Ministry of Education, Science, Research and Sport of the Slovak Republic under the project VEGA 1/0075/15 (50 %) and the Research and Development Operational Programme funded by the ERDF under the project ITMS: 26220220182 (50 %).

Author information

Corresponding author

Correspondence to Ján Staš.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Staš, J., Hládek, D., Juhár, J. (2015). Language Model Speaker Adaptation for Transcription of Slovak Parliament Proceedings. In: Ronzhin, A., Potapova, R., Fakotakis, N. (eds) Speech and Computer. SPECOM 2015. Lecture Notes in Computer Science, vol 9319. Springer, Cham. https://doi.org/10.1007/978-3-319-23132-7_32

  • DOI: https://doi.org/10.1007/978-3-319-23132-7_32

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-23131-0

  • Online ISBN: 978-3-319-23132-7

  • eBook Packages: Computer Science, Computer Science (R0)
