Abstract
Over the last decade, voice technologies for telephony and embedded solutions have matured considerably, resulting in applications that provide mobile access to digital information from anywhere. Both a growing demand for voice-driven applications in many languages and the need for improved usability and user experience now drive the exploration of multi-lingual speech processing techniques for recognition, synthesis, and conversational dialog management. In this overview article, we discuss our recent activities in multi-lingual voice technologies and describe the benefits of multi-lingual modeling for the creation of multi-modal mobile and telephony applications.
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Ivanecky, J., Fischer, J., Mast, M., Kunzmann, S., Ross, T., Fischer, V. (2005). Multi-lingual and Multi-modal Speech Processing and Applications. In: Kropatsch, W.G., Sablatnig, R., Hanbury, A. (eds) Pattern Recognition. DAGM 2005. Lecture Notes in Computer Science, vol 3663. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11550518_19
DOI: https://doi.org/10.1007/11550518_19
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-28703-2
Online ISBN: 978-3-540-31942-9
eBook Packages: Computer Science, Computer Science (R0)