
Part of the book series: Lecture Notes in Computer Science (LNISA, volume 5967)

Abstract

This paper deals with speech modification using a cepstral vocoder with the intent to change the emotional content of speech. The cepstral vocoder consists of analysis and synthesis stages. The analysis stage estimates speech parameters – vocal tract properties, fundamental frequency, intensity, etc. In this parametric domain, segmental and suprasegmental speech modifications may be performed, and the speech can then be reconstructed using the parametric source-filter cepstral model. We use the described cepstral vocoder and speech parameter modifications as a tool for research in emotional speech modeling and synthesis. The paper focuses on the description of this system and its possibilities rather than on the precise parameter settings needed to generate speech with given emotions. The system is still under development, and plans for future research are briefly summarized.
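
As a rough illustration of the analysis-modification-synthesis loop described in the abstract, the sketch below computes a real-cepstrum envelope, a crude fundamental-frequency estimate and the energy of one frame, and then resynthesizes the frame from an impulse-train excitation shaped by the smoothed spectral envelope. This is not the authors' vocoder; the sampling rate, frame length, cepstral order and pitch-search band are illustrative assumptions.

```python
# A minimal sketch of cepstral analysis and source-filter resynthesis for a
# single voiced frame, using NumPy only. NOT the authors' implementation;
# all constants below are assumptions made for illustration.
import numpy as np

FS = 16000        # sampling rate in Hz (assumed)
FRAME = 512       # frame length in samples (assumed)
N_CEPS = 40       # low-quefrency coefficients kept as the vocal-tract envelope

def analyse(frame):
    """Estimate the cepstral envelope, a crude F0 and the frame energy."""
    spectrum = np.fft.rfft(frame * np.hanning(len(frame)))
    log_mag = np.log(np.abs(spectrum) + 1e-12)
    ceps = np.fft.irfft(log_mag)                  # real cepstrum
    envelope = ceps[:N_CEPS].copy()               # low quefrencies ~ vocal tract
    lo, hi = int(FS / 400), int(FS / 60)          # search F0 between 60 and 400 Hz
    lag = lo + int(np.argmax(ceps[lo:hi]))        # strongest cepstral peak
    f0 = FS / lag
    energy = np.sqrt(np.mean(frame ** 2))
    return envelope, f0, energy

def synthesise(envelope, f0, energy, n=FRAME):
    """Rebuild one frame: impulse-train excitation shaped by the envelope."""
    ceps = np.zeros(n)                            # symmetric liftered cepstrum
    ceps[:N_CEPS] = envelope
    ceps[n - N_CEPS + 1:] = envelope[1:][::-1]
    log_mag = np.fft.rfft(ceps).real              # smoothed log-magnitude envelope
    period = max(1, int(round(FS / f0)))
    excitation = np.zeros(n)                      # impulse train at the pitch period
    excitation[::period] = 1.0
    out = np.fft.irfft(np.fft.rfft(excitation) * np.exp(log_mag), n)
    out *= energy / (np.sqrt(np.mean(out ** 2)) + 1e-12)
    return out

# Emotion-oriented modification happens between the two calls, e.g. raising F0
# by 30% as a purely illustrative prosodic change:
frame = np.random.randn(FRAME) * 0.1              # placeholder input frame
env, f0, en = analyse(frame)
resynth = synthesise(env, f0 * 1.3, en)
```

In this scheme, suprasegmental modifications (pitch contour, timing, intensity) and segmental ones (e.g. warping of the spectral envelope) would both operate on the parameters returned by the analysis stage before resynthesis.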

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Vondra, M., Vích, R. (2010). Speech Emotion Modification Using a Cepstral Vocoder. In: Esposito, A., Campbell, N., Vogel, C., Hussain, A., Nijholt, A. (eds) Development of Multimodal Interfaces: Active Listening and Synchrony. Lecture Notes in Computer Science, vol 5967. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-12397-9_23

  • DOI: https://doi.org/10.1007/978-3-642-12397-9_23

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-12396-2

  • Online ISBN: 978-3-642-12397-9

