Abstract
Automatic pronunciation scoring enables novel applications in computer-assisted language learning. In this paper we concentrate on feature extraction. We have designed a relatively large feature vector comprising 28 sentence-level and 33 word-level features. On the word level, words are classified as correctly pronounced or mispronounced; on the sentence level, utterances are rated with 5 discrete marks. The features are evaluated on two databases, one of non-native adults' speech and one of non-native children's speech. A class-wise averaged recognition rate of up to 72 % is achieved for the 2-class problem; the result of the 5-class problem can be interpreted as an 80 % recognition rate.
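The abstract reports a class-wise averaged recognition rate without defining it. The sketch below assumes the common reading of that term, namely the unweighted mean of the per-class recalls; this page does not confirm that the paper uses exactly this definition. The function names and the toy labels are invented for illustration and are not taken from the paper's data.

```python
from collections import Counter


def class_wise_averaged_rate(reference, predicted):
    """Unweighted mean of per-class recalls (assumed definition)."""
    if not reference or len(reference) != len(predicted):
        raise ValueError("need two non-empty label lists of equal length")
    totals = Counter(reference)  # samples per reference class
    hits = Counter(r for r, p in zip(reference, predicted) if r == p)
    return sum(hits[c] / totals[c] for c in totals) / len(totals)


def overall_rate(reference, predicted):
    """Plain fraction of correctly classified samples, for comparison."""
    if not reference or len(reference) != len(predicted):
        raise ValueError("need two non-empty label lists of equal length")
    return sum(r == p for r, p in zip(reference, predicted)) / len(reference)


# Toy example (hypothetical labels): 8 correct words, 2 mispronounced words.
reference = ["correct"] * 8 + ["mispronounced"] * 2
predicted = ["correct"] * 8 + ["correct", "mispronounced"]

print(overall_rate(reference, predicted))
print(class_wise_averaged_rate(reference, predicted))
```

In this toy case the classifier finds every correct word but only one of the two mispronounced ones, so the class-wise average is noticeably lower than the overall rate. That gap is the usual reason to report a class-wise figure when mispronounced words are much rarer than correct ones.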
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Hacker, C., Cincarek, T., Gruhn, R., Steidl, S., Nöth, E., Niemann, H. (2005). Pronunciation Feature Extraction. In: Kropatsch, W.G., Sablatnig, R., Hanbury, A. (eds) Pattern Recognition. DAGM 2005. Lecture Notes in Computer Science, vol 3663. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11550518_18
DOI: https://doi.org/10.1007/11550518_18
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-28703-2
Online ISBN: 978-3-540-31942-9
eBook Packages: Computer Science, Computer Science (R0)