Robust Automatic Evaluation of Intelligibility in Voice Rehabilitation Using Prosodic Analysis

  • Tino Haderlein
  • Anne Schützenberger
  • Michael Döllinger
  • Elmar Nöth
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10415)


Speech intelligibility for voice rehabilitation has been successfully evaluated by automatic prosodic analysis. In this paper, the influence of reading errors and the selection of certain words for the computation of prosodic features (nouns only, nouns and verbs, beginning of each sentence, beginnings of sentences and subclauses) are examined. 73 hoarse patients (48.3 ± 16.8 years) read the German version of the text “The North Wind and the Sun”. Their intelligibility was evaluated perceptually by 5 trained experts according to a 5-point scale. Eight prosodic features showed human-machine correlations of r ≥ 0.4. The normalized energy in a word-pause-word interval, computed from all words (r = 0.69 for the full speaker set), the mean of jitter in nouns and verbs (r = 0.67), and the pause duration before a word (r = 0.66) were the most robust features. However, reading errors can significantly influence these results.
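
The human-machine correlations quoted above relate one prosodic feature value per speaker to the perceptual intelligibility scores. The following is a minimal sketch of such a computation, not the authors' implementation. It assumes that the expert ratings are averaged per speaker and that r denotes Pearson's correlation coefficient; the function names `mean_expert_rating` and `pearson_r` and all numbers in the usage example are hypothetical.

```python
import math
from statistics import mean


def mean_expert_rating(ratings_per_rater):
    """Average the raters' scores per speaker.

    ratings_per_rater: one list per rater, each holding one score per speaker
    (assumed layout; the source only states 5 raters and a 5-point scale).
    """
    return [mean(scores) for scores in zip(*ratings_per_rater)]


def pearson_r(xs, ys):
    """Pearson correlation between two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Invented toy values for four speakers, NOT data from the study.
    ratings = [
        [1, 2, 4, 5],  # rater A
        [2, 2, 3, 5],  # rater B
    ]
    feature = [0.10, 0.15, 0.30, 0.45]  # hypothetical prosodic feature
    print(pearson_r(feature, mean_expert_rating(ratings)))
```

With real data, `feature` would hold one value per patient for a single prosodic feature (for example the pause duration before a word), and the call would be repeated for each feature and each word selection.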


Keywords: Intelligibility, Automatic assessment, Prosody, Reading errors



Dr. Döllinger’s contribution was supported by the German Research Foundation (Deutsche Forschungsgemeinschaft; DFG), grant no. DO1247/8-1.



Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  • Tino Haderlein (1)
  • Anne Schützenberger (2)
  • Michael Döllinger (2)
  • Elmar Nöth (1)
  1. Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU), Lehrstuhl für Informatik 5 (Mustererkennung), Erlangen, Germany
  2. Klinikum der Universität Erlangen-Nürnberg, Phoniatrische und pädaudiologische Abteilung in der HNO-Klinik, Erlangen, Germany
