
Part of the book series: Lecture Notes in Computer Science (LNISA, volume 5967)

Abstract

This paper deals with speech modification using a cepstral vocoder with the intent to change the emotional content of speech. The cepstral vocoder consists of analysis and synthesis stages. The analysis stage estimates speech parameters – vocal tract properties, fundamental frequency, intensity, etc. In this parametric domain, segmental and suprasegmental speech modifications may be performed, and the speech can then be reconstructed using the parametric source-filter cepstral model. We use the described cepstral vocoder and speech parameter modifications as a tool for research in emotional speech modeling and synthesis. The paper focuses on the description of this system and its possibilities rather than on the precise parameter settings needed to generate speech with given emotions. The system is still under development, and plans for future research are briefly summarized.
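
As a rough illustration of the analysis-modification-synthesis loop described in the abstract, the sketch below computes a real-cepstrum envelope, a crude fundamental-frequency estimate and the energy of one frame, and then resynthesizes the frame from an impulse-train excitation shaped by the smoothed spectral envelope. This is not the authors' vocoder; the sampling rate, frame length, cepstral order and pitch-search band are illustrative assumptions.

```python
# A minimal sketch of cepstral analysis and source-filter resynthesis for a
# single voiced frame, using NumPy only. NOT the authors' implementation;
# all constants below are assumptions made for illustration.
import numpy as np

FS = 16000        # sampling rate in Hz (assumed)
FRAME = 512       # frame length in samples (assumed)
N_CEPS = 40       # low-quefrency coefficients kept as the vocal-tract envelope

def analyse(frame):
    """Estimate the cepstral envelope, a crude F0 and the frame energy."""
    spectrum = np.fft.rfft(frame * np.hanning(len(frame)))
    log_mag = np.log(np.abs(spectrum) + 1e-12)
    ceps = np.fft.irfft(log_mag)                  # real cepstrum
    envelope = ceps[:N_CEPS].copy()               # low quefrencies ~ vocal tract
    lo, hi = int(FS / 400), int(FS / 60)          # search F0 between 60 and 400 Hz
    lag = lo + int(np.argmax(ceps[lo:hi]))        # strongest cepstral peak
    f0 = FS / lag
    energy = np.sqrt(np.mean(frame ** 2))
    return envelope, f0, energy

def synthesise(envelope, f0, energy, n=FRAME):
    """Rebuild one frame: impulse-train excitation shaped by the envelope."""
    ceps = np.zeros(n)                            # symmetric liftered cepstrum
    ceps[:N_CEPS] = envelope
    ceps[n - N_CEPS + 1:] = envelope[1:][::-1]
    log_mag = np.fft.rfft(ceps).real              # smoothed log-magnitude envelope
    period = max(1, int(round(FS / f0)))
    excitation = np.zeros(n)                      # impulse train at the pitch period
    excitation[::period] = 1.0
    out = np.fft.irfft(np.fft.rfft(excitation) * np.exp(log_mag), n)
    out *= energy / (np.sqrt(np.mean(out ** 2)) + 1e-12)
    return out

# Emotion-oriented modification happens between the two calls, e.g. raising F0
# by 30% as a purely illustrative prosodic change:
frame = np.random.randn(FRAME) * 0.1              # placeholder input frame
env, f0, en = analyse(frame)
resynth = synthesise(env, f0 * 1.3, en)
```

In this scheme, suprasegmental modifications (pitch contour, timing, intensity) and segmental ones (e.g. warping of the spectral envelope) would both operate on the parameters returned by the analysis stage before resynthesis.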

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Vondra, M., Vích, R. (2010). Speech Emotion Modification Using a Cepstral Vocoder. In: Esposito, A., Campbell, N., Vogel, C., Hussain, A., Nijholt, A. (eds) Development of Multimodal Interfaces: Active Listening and Synchrony. Lecture Notes in Computer Science, vol 5967. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-12397-9_23

  • DOI: https://doi.org/10.1007/978-3-642-12397-9_23

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-12396-2

  • Online ISBN: 978-3-642-12397-9

