Speech Identity Conversion

Vondra, Martin; Vích, Robert

doi:10.1007/11520153_28

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3445)

Included in the following conference series:

International School on Neural Networks, Initiated by IIASS and EMFCSC

Abstract

This paper presents a new voice conversion algorithm that transforms the utterance of a source speaker into the utterance of a target speaker or of a new, unknown speaker. The algorithm is based on spectral speech analysis, spectral envelope warping, spectrum interpolation, and high-quality parametric IIR or FIR cepstral speech synthesis. Several approaches to frequency warping of the speech spectrum are compared, e.g. linear frequency transformation, piecewise linear frequency modification, and nonlinear low-pass to low-pass frequency transformation. Prosodic transformations, i.e. modifications of the fundamental frequency, time scale, and intensity scale, are not addressed.
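
The abstract names three families of frequency warping but gives no formulas in this preview. The following is a minimal illustrative sketch of what such warps can look like, not the authors' algorithm. All function names, the parameters `alpha`, `knee`, and `a`, and the convention that output bin ω takes the source value at warp(ω) are assumptions made here. The nonlinear map uses the standard first-order all-pass (bilinear) frequency transformation as a stand-in for the low-pass to low-pass transformation mentioned in the abstract.

```python
import math

def linear_warp(omega, alpha):
    """Scale normalized frequency omega (0..pi) by alpha, clipped at pi."""
    return min(alpha * omega, math.pi)

def piecewise_linear_warp(omega, knee, alpha):
    """Scale by alpha below `knee`; the second segment ends at (pi, pi).

    Assumes 0 < knee < pi and alpha * knee < pi so the map stays monotonic.
    """
    if omega <= knee:
        return alpha * omega
    # Second segment joins (knee, alpha * knee) to (pi, pi).
    slope = (math.pi - alpha * knee) / (math.pi - knee)
    return alpha * knee + slope * (omega - knee)

def allpass_warp(omega, a):
    """First-order all-pass (bilinear) frequency map, with |a| < 1."""
    return omega + 2.0 * math.atan2(a * math.sin(omega),
                                    1.0 - a * math.cos(omega))

def warp_envelope(envelope, warp):
    """Resample an envelope sampled uniformly on [0, pi] along a warped axis.

    Output bin k takes the source value at frequency warp(omega_k),
    using linear interpolation between neighboring source bins.
    """
    n = len(envelope)
    out = []
    for k in range(n):
        omega = math.pi * k / (n - 1)
        src = warp(omega) * (n - 1) / math.pi  # fractional source index
        src = max(0.0, min(src, n - 1))        # keep inside the envelope
        i = min(int(src), n - 2)
        frac = src - i
        out.append((1.0 - frac) * envelope[i] + frac * envelope[i + 1])
    return out

# Hypothetical usage: a toy envelope warped by each of the three maps.
toy = [0.0, 1.0, 4.0, 9.0, 16.0]
warped_linear = warp_envelope(toy, lambda w: linear_warp(w, 1.1))
warped_piecewise = warp_envelope(toy, lambda w: piecewise_linear_warp(w, 1.0, 1.2))
warped_allpass = warp_envelope(toy, lambda w: allpass_warp(w, 0.1))
```

Under the assumed convention, a linear factor above 1 reads the source envelope at higher frequencies, which moves spectral features such as formants downward. The piecewise and all-pass maps keep the end points 0 and π fixed, so the warped envelope still covers the full band.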

Author information

Authors and Affiliations

Brno University of Technology, Purkyňova 118, 61200, Brno, Czech Republic
Martin Vondra
Institute of Radio Engineering and Electronics, Academy of Sciences of the Czech Republic, Chaberská 57, 182 51, Prague 8, Czech Republic
Robert Vích

Editor information

Editors and Affiliations

CNRS LTCI/TSI Paris, 46 rue Barrault, 75634, Paris Cedex 13, France
Gérard Chollet
Department of Psychology, Second University of Naples, and IIASS, Via Pellegrino 19, 84019, Vietri sul Mare, SA, Italy
Anna Esposito
Escola Universitària Politècnica de Mataró, Universitat Politècnica de Catalunya, Barcelona, Spain
Marcos Faundez-Zanuy
Dipartimento di Fisica “E.R. Caianiello”, Università degli Studi di Salerno, Via S. Allende, 84081, Baronissi, SA, Italy
Maria Marinaro

About this paper

Cite this paper

Vondra, M., Vích, R. (2005). Speech Identity Conversion. In: Chollet, G., Esposito, A., Faundez-Zanuy, M., Marinaro, M. (eds) Nonlinear Speech Modeling and Applications. NN 2004. Lecture Notes in Computer Science, vol 3445. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11520153_28

DOI: https://doi.org/10.1007/11520153_28
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-27441-4
Online ISBN: 978-3-540-31886-6
eBook Packages: Computer Science, Computer Science (R0)