Speech Identity Conversion

  • Conference paper
Nonlinear Speech Modeling and Applications (NN 2004)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3445)

Abstract

This paper presents a new voice conversion algorithm that transforms an utterance of a source speaker into an utterance of a target speaker or of a new, unknown speaker. The algorithm is based on spectral speech analysis, spectral envelope warping, spectrum interpolation, and high-quality parametric IIR or FIR cepstral speech synthesis. Several approaches to frequency warping of the speech spectrum are compared, e.g. linear frequency transformation, piecewise linear frequency modification, and nonlinear low-pass to low-pass frequency transformation. Prosodic transformations, i.e. modifications of fundamental frequency, time scale, and intensity, are not addressed.
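The three kinds of frequency warping named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the parameters `a`, `knee_in`, `knee_out`, and `alpha`, and the choice of the first-order all-pass phase function as the nonlinear low-pass to low-pass map are all assumptions made for illustration. Each warp returns, for every output frequency on [0, π], the source frequency at which the envelope is read.

```python
import numpy as np

def linear_warp(omega, a):
    """Linear map: read the source envelope at a * omega (clipped to [0, pi])."""
    return np.clip(a * omega, 0.0, np.pi)

def piecewise_linear_warp(omega, knee_in, knee_out):
    """Two-segment map through (0, 0), (knee_in, knee_out) and (pi, pi)."""
    return np.interp(omega, [0.0, knee_in, np.pi], [0.0, knee_out, np.pi])

def allpass_warp(omega, alpha):
    """Nonlinear map given by the phase of a first-order all-pass section.

    Assumed form: omega + 2*atan(alpha*sin(omega) / (1 - alpha*cos(omega))),
    valid for |alpha| < 1. It keeps 0 and pi fixed and is monotonic.
    """
    return omega + 2.0 * np.arctan2(alpha * np.sin(omega),
                                    1.0 - alpha * np.cos(omega))

def warp_envelope(envelope, warp, **params):
    """Resample an envelope given on a uniform grid over [0, pi]."""
    omega = np.linspace(0.0, np.pi, len(envelope))
    source_freq = warp(omega, **params)
    # Linear interpolation between the original envelope samples.
    return np.interp(source_freq, omega, envelope)

# Toy envelope with a single formant-like peak at pi/4 (hypothetical data).
bins = np.arange(257)
toy_envelope = np.exp(-((bins - 64) / 5.0) ** 2)
shifted = warp_envelope(toy_envelope, linear_warp, a=0.5)
```

With `a=0.5` the output at frequency ω takes the source value at ω/2, so the peak of the toy envelope moves up from π/4 to π/2. In practice the warped envelope would still have to be converted back to synthesis parameters, which this sketch does not cover.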

Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Vondra, M., Vích, R. (2005). Speech Identity Conversion. In: Chollet, G., Esposito, A., Faundez-Zanuy, M., Marinaro, M. (eds) Nonlinear Speech Modeling and Applications. NN 2004. Lecture Notes in Computer Science, vol 3445. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11520153_28

  • DOI: https://doi.org/10.1007/11520153_28

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-27441-4

  • Online ISBN: 978-3-540-31886-6

  • eBook Packages: Computer Science, Computer Science (R0)
