Abstract
In this paper, we present a fully automated process for creating the speech databases needed to train acoustic models for speech recognition systems. We show that archives of national parliaments are perfect sources of speech and text data for a lightly supervised training scheme that requires no human intervention. We describe the process and its procedures in detail and demonstrate its use on three Slavic languages (Polish, Russian and Bulgarian). Practical evaluation is done on a broadcast news task and yields better results than those obtained on some established speech databases.
Acknowledgements
The research was supported by the Technology Agency of the Czech Republic (project TA04010199) and by the Student Grant Scheme at the Technical University of Liberec.
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Nouza, J., Safarik, R. (2017). Parliament Archives Used for Automatic Training of Multi-lingual Automatic Speech Recognition Systems. In: Ekštein, K., Matoušek, V. (eds) Text, Speech, and Dialogue. TSD 2017. Lecture Notes in Computer Science, vol 10415. Springer, Cham. https://doi.org/10.1007/978-3-319-64206-2_20
DOI: https://doi.org/10.1007/978-3-319-64206-2_20
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-64205-5
Online ISBN: 978-3-319-64206-2
eBook Packages: Computer Science, Computer Science (R0)