Evaluating Different Non-native Pronunciation Scoring Metrics with the Japanese Speakers of the SAMPLE Corpus

  • Conference paper
Advances in Speech and Language Technologies for Iberian Languages (IberSPEECH 2016)

Abstract

This work analyzes the results obtained with the goodness of pronunciation (GOP) algorithm for phoneme-level pronunciation evaluation on the SAMPLE corpus of non-native speech. The corpus contains recordings of sentences uttered by different speakers, each rated for quality by a group of linguists. The same utterances were rated automatically with the GOP algorithm. We discuss how the scores depend on the phoneme and suggest normalizing intermediate results to improve the metric's performance. As a result, we present new scoring proposals based on the log-likelihood values obtained from the GOP algorithm combined with a set of rules. These new scores correlate better with the human ratings than the original GOP metric.
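As a rough illustration of the kind of computation the abstract refers to, the sketch below follows the widely used formulation of GOP: the absolute difference between the forced-alignment log-likelihood of the target phoneme and the best log-likelihood from an unconstrained phone loop over the same segment, divided by the number of frames. It also includes a plain Pearson correlation for comparing automatic scores with human ratings. The function names, argument names, and sample values are assumptions made for this example; they are not taken from the paper, and the paper's own normalization and rule-based scores are not reproduced here.

```python
import math


def gop_score(forced_loglik, free_loglik, num_frames):
    """Frame-normalized GOP for one phoneme segment (illustrative sketch).

    forced_loglik: log-likelihood of the segment under the target phoneme
                   (from forced alignment).
    free_loglik:   best log-likelihood of the same segment under any phoneme
                   (from an unconstrained phone loop).
    num_frames:    number of acoustic frames in the segment.
    """
    if num_frames <= 0:
        raise ValueError("num_frames must be positive")
    return abs(forced_loglik - free_loglik) / num_frames


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Made-up values for demonstration only; not data from the SAMPLE corpus.
    segments = [(-120.0, -100.0, 10), (-80.0, -78.0, 8), (-200.0, -150.0, 20)]
    scores = [gop_score(f, u, n) for f, u, n in segments]
    human_ratings = [2.0, 5.0, 1.0]  # hypothetical expert ratings
    print(scores)
    print(pearson(scores, human_ratings))
```

With this definition, values closer to zero mean the target phoneme explains the segment almost as well as the best competing phoneme, so a negative correlation with quality ratings would be the expected direction.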

We would like to thank the Ministerio de Economía y Competitividad and FEDER funds (project TIN2014-59852-R, Videojuegos Sociales para la Asistencia y Mejora de la Pronunciación de la Lengua Española) and the Junta de Castilla y León (project VA145U14, Evaluación Automática de la Pronunciación del Español Como Lengua Extranjera para Hablantes Japoneses).

Correspondence to David Escudero Mancebo.

© 2016 Springer International Publishing AG

About this paper

Cite this paper

Álvarez Álvarez, V., Escudero Mancebo, D., González Ferreras, C., Cardeñoso Payo, V. (2016). Evaluating Different Non-native Pronunciation Scoring Metrics with the Japanese Speakers of the SAMPLE Corpus. In: Abad, A., et al. Advances in Speech and Language Technologies for Iberian Languages. IberSPEECH 2016. Lecture Notes in Computer Science, vol 10077. Springer, Cham. https://doi.org/10.1007/978-3-319-49169-1_20

  • DOI: https://doi.org/10.1007/978-3-319-49169-1_20

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-49168-4

  • Online ISBN: 978-3-319-49169-1

  • eBook Packages: Computer Science, Computer Science (R0)
