Evaluating Different Non-native Pronunciation Scoring Metrics with the Japanese Speakers of the SAMPLE Corpus

Álvarez Álvarez, Vandria; Escudero Mancebo, David; González Ferreras, César; Cardeñoso Payo, Valentín

doi:10.1007/978-3-319-49169-1_20

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 10077)

Included in the following conference series:

International Conference on Advances in Speech and Language Technologies for Iberian Languages

Abstract

This work analyzes the results obtained with the goodness of pronunciation (GOP) algorithm when evaluating pronunciation at the phoneme level on the SAMPLE corpus of non-native speech. The corpus contains recordings of sentences uttered by different speakers, each rated for quality by a group of linguists. The same utterances have been rated automatically with the GOP algorithm. The dependence of the scores on the phoneme is discussed, suggesting that normalizing intermediate results could improve the performance of the metrics. As a result, new scoring proposals are presented, based on the log-likelihood values obtained from the GOP algorithm and on the application of a set of rules. These new scores correlate better with the human ratings than the original GOP metric.
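The abstract gives no formulas, so the following Python sketch only illustrates the kind of pipeline it describes and should not be read as the authors' method. It assumes the commonly used GOP definition (the absolute difference between the forced-alignment and free phone-loop log-likelihoods of a segment, divided by its number of frames). It uses a per-phoneme z-score as one possible normalization and Pearson correlation as the agreement measure with human ratings. All function names and input values are hypothetical.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean, pstdev


def gop(forced_loglik, free_loglik, n_frames):
    """Frame-normalized absolute log-likelihood difference for one phone segment.

    Assumed definition; the abstract does not state the exact formula used.
    """
    return abs(forced_loglik - free_loglik) / n_frames


def normalize_per_phoneme(segments):
    """Z-normalize GOP values within each phoneme class (illustrative choice).

    segments: list of (phoneme, gop_value) pairs.
    Returns the z-scores in the same order as the input.
    """
    by_phone = defaultdict(list)
    for phone, value in segments:
        by_phone[phone].append(value)
    stats = {p: (mean(v), pstdev(v)) for p, v in by_phone.items()}
    normalized = []
    for phone, value in segments:
        mu, sigma = stats[phone]
        # A phoneme with a single value (or no spread) maps to 0.0.
        normalized.append((value - mu) / sigma if sigma > 0 else 0.0)
    return normalized


def pearson(xs, ys):
    """Pearson correlation between automatic scores and human ratings."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(sxx * syy)
```

Normalizing within each phoneme class removes the effect of some phonemes yielding systematically higher or lower raw GOP values than others, which is the phoneme dependence the abstract refers to.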

We would like to thank the Ministerio de Economía y Competitividad and Fondos FEDER (project key TIN2014-59852-R, Videojuegos Sociales para la Asistencia y Mejora de la Pronunciación de la Lengua Española) and the Junta de Castilla y León (project key VA145U14, Evaluación Automática de la Pronunciación del Español Como Lengua Extranjera para Hablantes Japoneses).

Author information

Authors and Affiliations

Department of Computer Science, Universidad de Valladolid, Valladolid, Spain
Vandria Álvarez Álvarez, David Escudero Mancebo, César González Ferreras & Valentín Cardeñoso Payo

Corresponding author

Correspondence to David Escudero Mancebo.

Editor information

Editors and Affiliations

INESC-ID/IST, Universidade de Lisboa, Lisbon, Portugal
Alberto Abad
I3A/University of Zaragoza, Zaragoza, Spain
Alfonso Ortega
DETI/IEETA, University of Aveiro, Aveiro, Portugal
António Teixeira
AtlantTIC Research Center, Universidad de Vigo, Vigo, Spain
Carmen García Mateo
Universitat Politècnica de València, Valencia, Spain
Carlos D. Martínez Hinarejos
University of Coimbra, Coimbra, Portugal
Fernando Perdigão
INESC-ID/ISCTE-IUL, Lisbon, Portugal
Fernando Batista
INESC-ID/IST, Universidade de Lisboa, Lisbon, Portugal
Nuno Mamede

About this paper

Cite this paper

Álvarez Álvarez, V., Escudero Mancebo, D., González Ferreras, C., Cardeñoso Payo, V. (2016). Evaluating Different Non-native Pronunciation Scoring Metrics with the Japanese Speakers of the SAMPLE Corpus. In: Abad, A., et al. Advances in Speech and Language Technologies for Iberian Languages. IberSPEECH 2016. Lecture Notes in Computer Science, vol 10077. Springer, Cham. https://doi.org/10.1007/978-3-319-49169-1_20

DOI: https://doi.org/10.1007/978-3-319-49169-1_20
Published: 04 November 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-49168-4
Online ISBN: 978-3-319-49169-1
eBook Packages: Computer Science, Computer Science (R0)
