Emotional Speech Synthesis Based on Improved Codebook Mapping Voice Conversion

  • Conference paper
Affective Computing and Intelligent Interaction (ACII 2005)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 3784)

Abstract

This paper presents a spectral transformation method for emotional speech synthesis based on a voice conversion framework. Three emotions are studied: anger, happiness, and sadness. To achieve high naturalness, speech quality, and emotional expressiveness, our original STASC system is modified by introducing a new feature selection strategy and a hierarchical codebook mapping procedure. Our results show that the low-frequency LSF coefficients carry more emotion-related information, so only these coefficients are converted. Listening tests show that the proposed method achieves a satisfactory balance between emotional expression and the speech quality of the converted signals.
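
The abstract does not give implementation details, so the following is only a minimal sketch of the general idea of converting the low-order LSF coefficients through a paired codebook while leaving the rest untouched. It assumes a STASC-style soft weighting in which each source centroid is weighted by exp(-gamma * distance); the function name `convert_low_lsf`, the parameters `n_low` and `gamma`, and the flat (non-hierarchical) codebook are all hypothetical and are not taken from the paper.

```python
import math


def convert_low_lsf(lsf, src_codebook, tgt_codebook, n_low, gamma=1.0):
    """Convert only the first n_low LSF coefficients of one frame.

    lsf          -- LSF vector of the input (e.g. neutral) frame
    src_codebook -- list of source centroids (same length as lsf)
    tgt_codebook -- list of paired target (emotional) centroids
    n_low        -- number of low-order coefficients to convert (assumed)
    gamma        -- sharpness of the soft weighting (assumed)
    """
    # Squared distance to each source centroid, using only the
    # low-order coefficients that will be converted.
    dists = [
        sum((lsf[k] - centroid[k]) ** 2 for k in range(n_low))
        for centroid in src_codebook
    ]

    # Soft weights: closer centroids contribute more. Subtracting the
    # minimum distance keeps the exponentials numerically safe.
    d_min = min(dists)
    raw = [math.exp(-gamma * (d - d_min)) for d in dists]
    total = sum(raw)
    weights = [r / total for r in raw]

    # Converted low-order part: weighted sum of the paired target centroids.
    low = [
        sum(w * centroid[k] for w, centroid in zip(weights, tgt_codebook))
        for k in range(n_low)
    ]

    # High-order coefficients are passed through unchanged.
    return low + list(lsf[n_low:])


if __name__ == "__main__":
    frame = [0.10, 0.20, 0.30, 0.40]
    src = [[0.10, 0.20, 0.0, 0.0], [0.50, 0.60, 0.0, 0.0]]
    tgt = [[0.15, 0.25, 0.0, 0.0], [0.55, 0.65, 0.0, 0.0]]
    print(convert_low_lsf(frame, src, tgt, n_low=2, gamma=50.0))
```

A real system would also need to check that the combined vector is still strictly increasing, since the converted low-order values could overtake the first unconverted coefficient; this sketch does not do so.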

Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Wang, YP., Ling, ZH., Wang, RH. (2005). Emotional Speech Synthesis Based on Improved Codebook Mapping Voice Conversion. In: Tao, J., Tan, T., Picard, R.W. (eds) Affective Computing and Intelligent Interaction. ACII 2005. Lecture Notes in Computer Science, vol 3784. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11573548_48

  • DOI: https://doi.org/10.1007/11573548_48

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-29621-8

  • Online ISBN: 978-3-540-32273-3

  • eBook Packages: Computer Science, Computer Science (R0)
