Multi-lingual and Multi-modal Speech Processing and Applications

  • Conference paper
Pattern Recognition (DAGM 2005)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 3663)

Abstract

Over the last decade, voice technologies for telephony and embedded solutions have become much more mature, resulting in applications that provide mobile access to digital information from anywhere. Both a growing demand for voice-driven applications in many languages and the need for improved usability and user experience now drive the exploration of multi-lingual speech processing techniques for recognition, synthesis, and conversational dialog management. In this overview article, we discuss our recent activities on multi-lingual voice technologies and describe the benefits of multi-lingual modeling for the creation of multi-modal mobile and telephony applications.

Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Ivanecky, J., Fischer, J., Mast, M., Kunzmann, S., Ross, T., Fischer, V. (2005). Multi-lingual and Multi-modal Speech Processing and Applications. In: Kropatsch, W.G., Sablatnig, R., Hanbury, A. (eds) Pattern Recognition. DAGM 2005. Lecture Notes in Computer Science, vol 3663. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11550518_19

  • DOI: https://doi.org/10.1007/11550518_19

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-28703-2

  • Online ISBN: 978-3-540-31942-9

  • eBook Packages: Computer Science, Computer Science (R0)
