Parliament Archives Used for Automatic Training of Multi-lingual Automatic Speech Recognition Systems

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 10415)

Abstract

In this paper, we present a fully automated process for creating the speech databases needed to train acoustic models for speech recognition systems. We show that archives of national parliaments are perfect sources of speech and text data for a lightly supervised training scheme that requires no human intervention. We describe the process and its procedures in detail and demonstrate its use on three Slavic languages (Polish, Russian and Bulgarian). Practical evaluation is done on a broadcast news task and yields better results than those obtained with some established speech databases.
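The abstract does not spell out how the lightly supervised scheme selects training data, so the following is only a generic sketch of the idea, not the authors' procedure. It assumes that an existing recognizer has already produced a word hypothesis for a recording and that the official parliament transcript is available as a word list. Word runs on which the two agree are kept as trusted training material. The function name `select_matching_runs` and the threshold `min_words` are hypothetical.

```python
from difflib import SequenceMatcher


def select_matching_runs(hypothesis, transcript, min_words=3):
    """Return word runs on which the recognizer output agrees with the transcript.

    hypothesis: list of words produced by an existing recognizer (assumed input)
    transcript: list of words from the official parliament record (assumed input)
    min_words:  hypothetical threshold; shorter agreements are discarded
    """
    # autojunk=False disables the heuristic that ignores very frequent items.
    matcher = SequenceMatcher(a=hypothesis, b=transcript, autojunk=False)
    runs = []
    for block in matcher.get_matching_blocks():
        # The final block is a sentinel of size 0 and is filtered out here.
        if block.size >= min_words:
            runs.append(hypothesis[block.a:block.a + block.size])
    return runs


# Toy example: the transcript differs slightly from what was actually said.
hyp = "the committee has approved a budget for next year".split()
ref = "the committee has approved the budget for the next year".split()
trusted = select_matching_runs(hyp, ref, min_words=3)
```

With a threshold of three words, only the longer agreement at the start of the toy sentence is kept. Lowering `min_words` admits shorter runs at the cost of trusting less context around each match.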

Notes

  1. http://www.sejm.gov.pl/

  2. http://www.duma.gov.ru/

  3. http://www.parliament.bg/tv/

  4. https://gitlab.ite.tul.cz/SpeechLab/EastSlavicTestData

  5. https://gitlab.ite.tul.cz/SpeechLab/SouthSlavicTestData

Acknowledgements

The research was supported by the Technology Agency of the Czech Republic (project TA04010199) and by the Student Grant Scheme at the Technical University of Liberec.

Author information

Corresponding author

Correspondence to Jan Nouza.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Nouza, J., Safarik, R. (2017). Parliament Archives Used for Automatic Training of Multi-lingual Automatic Speech Recognition Systems. In: Ekštein, K., Matoušek, V. (eds) Text, Speech, and Dialogue. TSD 2017. Lecture Notes in Computer Science, vol 10415. Springer, Cham. https://doi.org/10.1007/978-3-319-64206-2_20

  • DOI: https://doi.org/10.1007/978-3-319-64206-2_20

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-64205-5

  • Online ISBN: 978-3-319-64206-2

  • eBook Packages: Computer Science, Computer Science (R0)
