Pronunciation Feature Extraction

Hacker, Christian; Cincarek, Tobias; Gruhn, Rainer; Steidl, Stefan; Nöth, Elmar; Niemann, Heinrich

doi:10.1007/11550518_18

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 3663)

Abstract

Automatic pronunciation scoring enables novel applications in computer-assisted language learning. This paper concentrates on feature extraction. A relatively large feature vector with 28 sentence-level and 33 word-level features has been designed. At the word level, words are classified as correctly pronounced or mispronounced; at the sentence level, utterances are rated with 5 discrete marks. The features are evaluated on two databases containing non-native adults’ and children’s speech, respectively. A class-wise-averaged recognition rate of up to 72% is achieved for 2 classes; the result of the 5-class problem can be interpreted as an 80% recognition rate.
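
The abstract does not define the class-wise-averaged recognition rate. A common reading, assumed here, is the unweighted mean of the per-class recognition rates (recalls), so that a rare class such as mispronounced words counts as much as a frequent one. The following is a minimal sketch under that assumption; the function name and the toy labels are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def classwise_averaged_recognition_rate(reference, predicted):
    """Unweighted mean of per-class recalls (assumed definition)."""
    if len(reference) != len(predicted):
        raise ValueError("reference and predicted must have the same length")
    if not reference:
        raise ValueError("no samples given")

    total = defaultdict(int)    # number of samples per reference class
    correct = defaultdict(int)  # correctly recognized samples per class
    for ref, hyp in zip(reference, predicted):
        total[ref] += 1
        if ref == hyp:
            correct[ref] += 1

    # Each class contributes equally, regardless of how many samples it has.
    recalls = [correct[c] / total[c] for c in total]
    return sum(recalls) / len(recalls)


# Hypothetical word-level labels: 8 correctly pronounced, 2 mispronounced.
reference = ["correct"] * 8 + ["wrong"] * 2
predicted = ["correct"] * 7 + ["wrong"] + ["wrong", "correct"]

rate = classwise_averaged_recognition_rate(reference, predicted)
overall = sum(r == p for r, p in zip(reference, predicted)) / len(reference)
print(f"class-wise averaged: {rate:.4f}, overall: {overall:.4f}")
```

On this imbalanced toy data the overall rate is higher than the class-wise average, because the overall rate is dominated by the frequent class. That is one reason a class-wise average may be preferred when mispronounced words are rare.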

Author information

Authors and Affiliations

Lehrstuhl für Mustererkennung, Universität Erlangen-Nürnberg, Martensstraße 3, D-91058, Erlangen, Germany
Christian Hacker, Stefan Steidl, Elmar Nöth & Heinrich Niemann
ATR Spoken Language Translation Res. Labs., Kyoto, Japan
Tobias Cincarek & Rainer Gruhn

Editor information

Editors and Affiliations

PRIP, Vienna University of Technology, Austria
Walter G. Kropatsch
Vienna University of Technology, Vienna, Austria
Robert Sablatnig
Pattern Recognition and Image Processing Group, Institute of Computer-Aided Automation, Vienna University of Technology, Favoritenstraße 9/1832, A-1040, Vienna, Austria
Allan Hanbury

About this paper

Cite this paper

Hacker, C., Cincarek, T., Gruhn, R., Steidl, S., Nöth, E., Niemann, H. (2005). Pronunciation Feature Extraction. In: Kropatsch, W.G., Sablatnig, R., Hanbury, A. (eds) Pattern Recognition. DAGM 2005. Lecture Notes in Computer Science, vol 3663. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11550518_18

DOI: https://doi.org/10.1007/11550518_18
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-28703-2
Online ISBN: 978-3-540-31942-9
eBook Packages: Computer Science, Computer Science (R0)
