Abstract
Speech intelligibility for voice rehabilitation can successfully be evaluated by automatic prosodic analysis. This paper examines the influence of reading errors and of restricting the analysis to certain words (nouns only, nouns and verbs, the beginning of each sentence, the beginnings of sentences and subclauses) on the computation of the word accuracy (WA) and prosodic features. Seventy-three hoarse patients read the German version of the text “The North Wind and the Sun”. Five trained experts rated their intelligibility perceptually on a 5-point scale. Combining prosodic features and WA by Support Vector Regression yielded human-machine correlations of up to \(r=0.86\). The correlations drop for files with few reading errors, but this can largely be compensated for by adjusting the feature set. WA should be computed on the whole text, but for some prosodic features, a subset of words may be sufficient.
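The abstract relies on two quantities: the word accuracy of a recognized word sequence against the reference text, and the correlation \(r\) between automatic and perceptual scores. The following is a minimal sketch of both, assuming the common definition WA = (N − S − D − I) / N × 100 over a Levenshtein word alignment and assuming \(r\) is Pearson's coefficient; the abstract itself does not state which definitions or which recognizer the authors used. The function names `word_accuracy` and `pearson_r` are hypothetical and chosen for illustration only.

```python
from math import sqrt


def word_accuracy(reference, recognized):
    """Word accuracy in percent, assuming WA = (N - S - D - I) / N * 100.

    S, D, I are the substitutions, deletions and insertions of a minimal
    word-level edit alignment; N is the number of reference words.
    """
    ref = reference.split()
    hyp = recognized.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j]: minimal edits between the reference prefix so far and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    errors = prev[-1]
    return 100.0 * (len(ref) - errors) / len(ref)


def pearson_r(xs, ys):
    """Pearson correlation between two equally long score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)
```

Under this definition, insertions can push WA below zero, so a recognizer output that is much longer than the reference yields a negative score. Restricting WA to a subset of words (for example nouns only) would additionally require deciding how aligned words are assigned to the subset, which this sketch does not attempt.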
Acknowledgments
Dr. Döllinger’s contribution was supported by the German Research Foundation (DFG), grant no. DO1247/8-1 (no. 323308998).
Copyright information
© 2018 Springer Nature Switzerland AG
About this paper
Cite this paper
Haderlein, T., Schützenberger, A., Döllinger, M., Nöth, E. (2018). Subtext Word Accuracy and Prosodic Features for Automatic Intelligibility Assessment. In: Sojka, P., Horák, A., Kopeček, I., Pala, K. (eds) Text, Speech, and Dialogue. TSD 2018. Lecture Notes in Computer Science, vol 11107. Springer, Cham. https://doi.org/10.1007/978-3-030-00794-2_51
DOI: https://doi.org/10.1007/978-3-030-00794-2_51
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-00793-5
Online ISBN: 978-3-030-00794-2
eBook Packages: Computer Science, Computer Science (R0)