Automatic Recognition and Evaluation of Tracheoesophageal Speech

  • Conference paper
Text, Speech and Dialogue (TSD 2004)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3206)

Abstract

Tracheoesophageal (TE) speech is one way to restore the ability to speak after laryngectomy, i.e. the removal of the larynx. TE speech often shows low audibility and intelligibility, which also makes it a challenge for automatic speech recognition. We improved the recognition results by adapting a speech recognizer trained on normal, non-pathologic voices to individual TE speakers by unsupervised HMM interpolation.
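
The abstract does not say which HMM parameters are interpolated or how the interpolation weights are obtained. As a rough illustration of the general idea only, the sketch below linearly blends the mean vectors of a baseline model with those of a speaker-specific model. The function name, the choice of means as the interpolated parameter, and the single fixed weight are all assumptions made for this example, not the authors' procedure.

```python
def interpolate_means(base_means, adapted_means, weight):
    """Blend two sets of per-state mean vectors (hypothetical helper).

    weight = 0.0 keeps the baseline model, weight = 1.0 keeps the
    speaker-specific model. Using one global weight is an assumption;
    real systems may estimate weights per state or per density.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    if len(base_means) != len(adapted_means):
        raise ValueError("both models need the same number of states")
    return [
        [(1.0 - weight) * b + weight * a for b, a in zip(base_vec, adapted_vec)]
        for base_vec, adapted_vec in zip(base_means, adapted_means)
    ]


# Toy example: one state with a two-dimensional mean vector.
blended = interpolate_means([[0.0, 2.0]], [[4.0, 6.0]], 0.25)
```

A small weight keeps the blended model close to the well-trained baseline, which is the usual motivation for interpolation when speaker-specific data is sparse.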

In speech rehabilitation, the patient’s voice quality has to be evaluated. As no objective means of classification exists so far and automation of this procedure is desirable, we performed initial experiments on automatic evaluation of intelligibility. We compared scoring results for TE speech from five experienced raters with the word accuracy from different types of speech recognizers. Correlation coefficients of about –0.8 are promising for future work.
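
The comparison described above can be sketched as follows, assuming the common definition of word accuracy as one minus the word-level edit distance divided by the reference length, and assuming a Pearson correlation (the abstract does not name the coefficient used). The function names are illustrative. A negative coefficient would be expected if lower rater scores denote better intelligibility, which is also an assumption here.

```python
import math


def word_accuracy(reference, hypothesis):
    """1 - (substitutions + deletions + insertions) / reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # Word-level Levenshtein distance, computed row by row.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, 1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, 1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return 1.0 - prev[-1] / len(ref)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (norm_x * norm_y)
```

In use, one would compute `word_accuracy` per speaker from the recognizer output and pass the per-speaker accuracies together with the per-speaker mean rater scores to `pearson`.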

This work was partly funded by the EU in the project PF-STAR under grant IST-2001-37599. The responsibility for the contents of this study lies with the authors.


Copyright information

© 2004 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Haderlein, T., Steidl, S., Nöth, E., Rosanowski, F., Schuster, M. (2004). Automatic Recognition and Evaluation of Tracheoesophageal Speech. In: Sojka, P., Kopeček, I., Pala, K. (eds) Text, Speech and Dialogue. TSD 2004. Lecture Notes in Computer Science, vol 3206. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30120-2_42

  • DOI: https://doi.org/10.1007/978-3-540-30120-2_42

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-23049-6

  • Online ISBN: 978-3-540-30120-2

  • eBook Packages: Springer Book Archive
