Standard Speaker Selection in Speech Synthesis for Mandarin Tone Learning

  • Conference paper

Part of the book series: Lecture Notes in Electrical Engineering (LNEE, volume 212)

Abstract

The teaching speech that L2 learners imitate plays a key role in learning Mandarin tones. Synthesized teaching speech has been found to be more acceptable when it resembles the L2 learner’s own speech. Voice modification technology can synthesize such teaching speech by combining standard Chinese speech with the learner’s speech. The choice of standard Chinese speaker, however, affects the quality of the synthesized speech. This paper studies how to select the standard Chinese speech used in teaching speech synthesis. Speaker features including MFCC, pitch, and rhythm are compared, and a Gaussian Mixture Model is used to select the most appropriate Chinese speaker. Perceptual experiments show that modification using the Chinese speech most similar to the learner’s speech in MFCC yields the best teaching speech in both phonetic and tonal quality.
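The selection step described in the abstract can be illustrated with a small sketch. The abstract does not say how the Gaussian Mixture Model is applied, so the following is one plausible reading and not the authors' method: a diagonal-covariance GMM is fitted to the learner's feature frames, and each candidate standard speaker is scored by the average log-likelihood of their frames under that model. The function names, the two-component setting, and the synthetic 4-dimensional frames standing in for MFCC vectors are all assumptions made for illustration.

```python
import math
import random


def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at vector x."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) over a list of floats."""
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values))


def frame_loglik(x, gmm):
    """Log-likelihood of one frame under a GMM (weights, means, variances)."""
    weights, means, variances = gmm
    return logsumexp([
        math.log(w) + log_gauss_diag(x, m, v)
        for w, m, v in zip(weights, means, variances)
    ])


def fit_gmm(frames, k=2, iters=20, var_floor=1e-3):
    """Fit a diagonal-covariance GMM with plain EM (toy implementation)."""
    n, dim = len(frames), len(frames[0])
    # Deterministic initialisation: means taken from evenly spaced frames.
    step = max(1, n // k)
    means = [list(frames[(j * step) % n]) for j in range(k)]
    variances = [[1.0] * dim for _ in range(k)]
    weights = [1.0 / k] * k
    for _ in range(iters):
        # E-step: responsibility of each component for each frame.
        resp = []
        for x in frames:
            logs = [
                math.log(weights[j]) + log_gauss_diag(x, means[j], variances[j])
                for j in range(k)
            ]
            norm = logsumexp(logs)
            resp.append([math.exp(l - norm) for l in logs])
        # M-step: re-estimate weights, means, and floored variances.
        for j in range(k):
            nj = sum(r[j] for r in resp) + 1e-10
            weights[j] = nj / n
            means[j] = [
                sum(r[j] * x[d] for r, x in zip(resp, frames)) / nj
                for d in range(dim)
            ]
            variances[j] = [
                max(var_floor, sum(
                    r[j] * (x[d] - means[j][d]) ** 2
                    for r, x in zip(resp, frames)
                ) / nj)
                for d in range(dim)
            ]
    return weights, means, variances


def select_standard_speaker(learner_frames, candidates, k=2):
    """Return the candidate whose frames score highest under the learner GMM."""
    gmm = fit_gmm(learner_frames, k=k)
    scores = {
        name: sum(frame_loglik(x, gmm) for x in frames) / len(frames)
        for name, frames in candidates.items()
    }
    return max(scores, key=scores.get), scores


def synthetic_frames(rng, centre, n=200, dim=4):
    """Hypothetical stand-in for MFCC frames: Gaussian noise around a centre."""
    return [[rng.gauss(centre, 1.0) for _ in range(dim)] for _ in range(n)]


rng = random.Random(0)
learner = synthetic_frames(rng, 0.0)
candidates = {
    "speaker_A": synthetic_frames(rng, 0.2),  # close to the learner
    "speaker_B": synthetic_frames(rng, 3.0),  # far from the learner
}
best, scores = select_standard_speaker(learner, candidates)
print(best, scores)
```

In practice the frames would come from an MFCC extractor run on recorded speech, and pitch or rhythm features could be scored the same way by swapping the feature vectors; the pure-Python EM loop is kept only so the sketch has no dependencies.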

Acknowledgments

The research underlying this paper was supported by the National Natural Science Foundation of China (61175019) and the Youth Independent Research Program Projects of Beijing Language and Culture University (10JBT01).

Author information

Corresponding author

Correspondence to Yanlu Xie.

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Xie, Y., Zhang, J., Shi, S. (2013). Standard Speaker Selection in Speech Synthesis for Mandarin Tone Learning. In: Lu, W., Cai, G., Liu, W., Xing, W. (eds) Proceedings of the 2012 International Conference on Information Technology and Software Engineering. Lecture Notes in Electrical Engineering, vol 212. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-34531-9_39

  • DOI: https://doi.org/10.1007/978-3-642-34531-9_39

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-34530-2

  • Online ISBN: 978-3-642-34531-9

  • eBook Packages: Engineering, Engineering (R0)
