Parliament Archives Used for Automatic Training of Multi-lingual Automatic Speech Recognition Systems

Nouza, Jan; Safarik, Radek

doi:10.1007/978-3-319-64206-2_20

Conference paper
First Online: 29 July 2017

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 10415)

Abstract

In this paper, we present a fully automated process for creating the speech databases needed to train acoustic models for speech recognition systems. We show that archives of national parliaments are perfect sources of speech and text data for a lightly supervised training scheme that requires no human intervention. We describe the process and its procedures in detail and demonstrate its use on three Slavic languages (Polish, Russian, and Bulgarian). A practical evaluation on a broadcast news task yields better results than those obtained with some established speech databases.

Acknowledgements

The research was supported by the Technology Agency of the Czech Republic (project TA04010199) and by the Student Grant Scheme at the Technical University of Liberec.

Author information

Authors and Affiliations

Institute of Information Technology and Electronics, Technical University of Liberec, Studentska 2, 461 17, Liberec, Czech Republic
Jan Nouza & Radek Safarik

Corresponding author

Correspondence to Jan Nouza.

Editor information

Editors and Affiliations

University of West Bohemia, Pilsen, Czech Republic
Kamil Ekštein
University of West Bohemia, Pilsen, Czech Republic
Václav Matoušek

About this paper

Cite this paper

Nouza, J., Safarik, R. (2017). Parliament Archives Used for Automatic Training of Multi-lingual Automatic Speech Recognition Systems. In: Ekštein, K., Matoušek, V. (eds) Text, Speech, and Dialogue. TSD 2017. Lecture Notes in Computer Science, vol 10415. Springer, Cham. https://doi.org/10.1007/978-3-319-64206-2_20

DOI: https://doi.org/10.1007/978-3-319-64206-2_20
Published: 29 July 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-64205-5
Online ISBN: 978-3-319-64206-2
eBook Packages: Computer Science, Computer Science (R0)
