Abstract
This paper describes an enhanced system for more efficient voice conversion. The spectral conversion function is trained with a weighted LMSE (Least Mean Squared Error) criterion instead of the conventional LMSE criterion. In addition, a short-term pitch contour mapping algorithm is presented, together with a new residual codebook formed from the pitch contour. Informal listening tests indicate that convincing voice conversion is achieved while high speech quality is maintained. Objective evaluations also show that, in an LPC-based analysis/synthesis framework, the proposed system reduces speaker individual discrimination compared with the baseline system.
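To illustrate the general idea behind a weighted least-squares criterion, the sketch below fits a linear map from source to target feature vectors so that frames with larger weights contribute more to the squared error. It is a generic illustration only: the abstract does not specify the form of the conversion function, the features, or how the weights are chosen, so the names `fit_weighted_linear_map`, `convert`, and `w`, and the single global affine transform, are all assumptions.

```python
import numpy as np


def fit_weighted_linear_map(X, Y, w):
    """Solve min over (A, b) of sum_t w[t] * ||A x_t + b - y_t||^2.

    X: (T, d) source feature vectors (hypothetical, e.g. spectral parameters)
    Y: (T, k) time-aligned target feature vectors
    w: (T,)   non-negative per-frame weights (assumed to be given)
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    w = np.asarray(w, dtype=float)

    # Append a constant column so the bias b is estimated together with A.
    X_aug = np.hstack([X, np.ones((X.shape[0], 1))])

    # Scaling each row by sqrt(w[t]) turns the weighted problem into an
    # ordinary least-squares problem with the same minimizer.
    s = np.sqrt(w)[:, None]
    coef, *_ = np.linalg.lstsq(X_aug * s, Y * s, rcond=None)

    A = coef[:-1].T  # (k, d) linear part
    b = coef[-1]     # (k,)   bias
    return A, b


def convert(X, A, b):
    """Apply the fitted affine map to source feature vectors."""
    return np.asarray(X, dtype=float) @ A.T + b
```

With all weights equal, this reduces to the conventional least-squares fit; unequal weights shift the fit toward the frames judged more important.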
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Zhang, J., Sun, J., Dai, B. (2005). Voice Conversion Based on Weighted Least Squares Estimation Criterion and Residual Prediction from Pitch Contour. In: Tao, J., Tan, T., Picard, R.W. (eds) Affective Computing and Intelligent Interaction. ACII 2005. Lecture Notes in Computer Science, vol 3784. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11573548_42
DOI: https://doi.org/10.1007/11573548_42
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-29621-8
Online ISBN: 978-3-540-32273-3
eBook Packages: Computer Science (R0)