Abstract
This paper presents a new voice conversion algorithm that transforms the utterance of a source speaker into the utterance of a target speaker or of a new, unknown speaker. The algorithm is based on spectral speech analysis, spectral envelope warping, spectrum interpolation, and high-quality parametric IIR or FIR cepstral speech synthesis. Several approaches to frequency warping of the speech spectrum are compared, e.g. linear frequency transformation, piecewise linear frequency modification, and nonlinear low-pass to low-pass frequency transformation. Prosodic transformations, i.e. fundamental frequency, time, and intensity scale modifications, are not addressed.
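The three warping families named in the abstract can be illustrated with a minimal sketch. The abstract gives no formulas, so everything below is an assumption made for illustration: the function names, the parameters `k`, `knee_in`, `knee_out` and `alpha`, and in particular the choice of the textbook first-order all-pass substitution (as found in Oppenheim and Schafer) as the low-pass to low-pass mapping. It is not the authors' implementation.

```python
import numpy as np

def warp_linear(omega, k):
    """Linear warping (assumed form): scale the frequency axis by factor k."""
    return k * omega

def warp_piecewise_linear(omega, knee_in, knee_out):
    """Two-segment map through (0, 0), (knee_in, knee_out) and (pi, pi)."""
    return np.interp(omega, [0.0, knee_in, np.pi], [0.0, knee_out, np.pi])

def warp_allpass(omega, alpha):
    """Nonlinear low-pass to low-pass map of a first-order all-pass
    substitution (assumed form); requires |alpha| < 1.
    Maps 0 -> 0 and pi -> pi and is monotonic in between."""
    return omega + 2.0 * np.arctan(
        alpha * np.sin(omega) / (1.0 - alpha * np.cos(omega))
    )

def warp_envelope(envelope, warp, **params):
    """Move the envelope value at frequency w to frequency warp(w).

    The envelope is assumed to be sampled uniformly on [0, pi].
    Content mapped above pi is discarded; if the warped axis ends
    below pi, np.interp holds the last value to fill the gap.
    """
    omega = np.linspace(0.0, np.pi, len(envelope))
    warped_axis = warp(omega, **params)
    return np.interp(omega, warped_axis, envelope)

# Hypothetical usage: one synthetic "formant" peak at 1.0 rad.
omega = np.linspace(0.0, np.pi, 513)
envelope = np.exp(-((omega - 1.0) / 0.1) ** 2)
shifted = warp_envelope(envelope, warp_linear, k=1.2)
peak_after = omega[np.argmax(shifted)]  # expected near 1.2 rad
```

All three maps are monotonic, so the warped axis can be inverted by plain interpolation. A linear map moves every spectral feature by the same factor, while the piecewise linear and all-pass maps keep the end points of the band fixed and redistribute the frequencies in between.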
© 2005 Springer-Verlag Berlin Heidelberg
Cite this paper
Vondra, M., Vích, R. (2005). Speech Identity Conversion. In: Chollet, G., Esposito, A., Faundez-Zanuy, M., Marinaro, M. (eds) Nonlinear Speech Modeling and Applications. NN 2004. Lecture Notes in Computer Science, vol 3445. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11520153_28
DOI: https://doi.org/10.1007/11520153_28
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-27441-4
Online ISBN: 978-3-540-31886-6
eBook Packages: Computer Science (R0)