Parliament Archives Used for Automatic Training of Multi-lingual Automatic Speech Recognition Systems

  • Jan Nouza
  • Radek Safarik
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10415)

Abstract

In this paper, we present a fully automated process for creating the speech databases needed to train acoustic models for speech recognition systems. We show that archives of national parliaments are perfect sources of speech and text data for a lightly supervised training scheme that does not require human intervention. We describe the process and its procedures in detail and demonstrate its use on three Slavic languages (Polish, Russian, and Bulgarian). Practical evaluation on a broadcast news task yields better results than those obtained with some established speech databases.

Keywords

Speech recognition, Cross-lingual bootstrapping, Parliament speech

Notes

Acknowledgements

The research was supported by the Technology Agency of the Czech Republic (project TA04010199) and by the Student Grant Scheme at the Technical University of Liberec.

Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  1. Institute of Information Technology and Electronics, Technical University of Liberec, Liberec, Czech Republic
