Abstract
One of the biggest challenges in the automatic transcription of the historical audio archive of Czech and Czechoslovak radio is bilingualism. Two closely related languages, Czech and Slovak, are mixed in many archive documents. Both were official languages of the former Czechoslovakia (1918-1992), and both were used in the media. The two languages are considered similar, yet they differ in more than 75 % of their lexical inventories, which complicates automatic speech-to-text conversion. In this paper, we describe and objectively measure the differences between the two languages. We then propose a method for the automatic identification of two acoustically and lexically similar languages. It employs two size-optimized parallel lexicons and language models. On large test data, we show that the two languages can be distinguished with almost 99 % accuracy. Moreover, the language identification module can be easily incorporated into a two-pass decoding scheme at almost negligible additional computational cost. The proposed method has been employed in a project aimed at the disclosure of Czech and Czechoslovak oral cultural heritage.
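As a rough illustration of the idea of scoring the same utterance against two parallel language models, the following Python sketch picks the language whose model assigns the higher likelihood. It is a toy, text-level stand-in and not the authors' system: the add-one smoothed unigram models, the tiny training sentences, and the names `train_unigram`, `log_score`, `identify` and `MODELS` are all assumptions made for this example, whereas the paper works with full lexicons and language models inside a speech decoder.

```python
import math
from collections import Counter

def train_unigram(sentences):
    """Return (word counts, total token count) for a small corpus."""
    counts = Counter(w for s in sentences for w in s.lower().split())
    return counts, sum(counts.values())

def log_score(words, model, vocab_size):
    """Add-one smoothed unigram log-probability of a word sequence."""
    counts, total = model
    return sum(math.log((counts[w] + 1) / (total + vocab_size)) for w in words)

def identify(text, models):
    """Score text under every model and return (best language, all scores)."""
    words = text.lower().split()
    vocab = set()
    for counts, _ in models.values():
        vocab.update(counts)
    vocab_size = len(vocab) + 1  # one extra slot for unseen words
    scores = {lang: log_score(words, m, vocab_size) for lang, m in models.items()}
    return max(scores, key=scores.get), scores

# Hypothetical toy training data; real models would be trained on large corpora.
MODELS = {
    "cs": train_unigram(["jsem rád že jste přišli", "co se stalo včera večer"]),
    "sk": train_unigram(["som rád že ste prišli", "čo sa stalo včera večer"]),
}

if __name__ == "__main__":
    for utterance in ["co jste včera dělali", "čo ste včera robili"]:
        lang, _ = identify(utterance, MODELS)
        print(utterance, "->", lang)
```

Words shared by both languages (here `včera`) contribute equally to both scores, so the decision rests on the words that differ, which is why a large lexical difference between the languages helps identification.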
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Nouza, J., Cerva, P., Silovsky, J. (2013). Dealing with Bilingualism in Automatic Transcription of Historical Archive of Czech Radio. In: Petrosino, A., Maddalena, L., Pala, P. (eds) New Trends in Image Analysis and Processing – ICIAP 2013. ICIAP 2013. Lecture Notes in Computer Science, vol 8158. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-41190-8_26
DOI: https://doi.org/10.1007/978-3-642-41190-8_26
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-41189-2
Online ISBN: 978-3-642-41190-8
eBook Packages: Computer Science, Computer Science (R0)