Reduced Search Space Frame Alignment Based on Kullback-Leibler Divergence for Voice Conversion

Shahrebabaki, Abdoreza Sabzi; Amini, Jamal; Sheikhzadeh, Hamid; Ghorbandoost, Mostafa; Faraji, Neda

doi:10.1007/978-3-642-38847-7_11

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 7911)

Included in the following conference series:

International Conference on Nonlinear Speech Processing

Abstract

A new text-independent voice conversion method based on the Kullback-Leibler divergence (KLD) is proposed. The method uses only acoustic information and requires no linguistic or phonetic information. The KLD is used to find reliable correspondences between the source and target GMM clusters and to reduce the search space for aligning source and target frames. Subjective evaluation results show that the proposed method can achieve the same performance as parallel voice conversion methods.
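
The cluster-correspondence idea in the abstract can be illustrated with a minimal sketch. The chapter body is not available here, so this is not the authors' procedure. It assumes that each GMM cluster is summarized by a single Gaussian (mean and covariance), that a symmetrized KLD is used as the distance, and that each source cluster is paired with the target cluster of smallest divergence. The function names (`gaussian_kld`, `symmetric_kld`, `match_clusters`) and the toy values are hypothetical.

```python
import numpy as np

def gaussian_kld(mu0, cov0, mu1, cov1):
    """Closed-form KL divergence KL(N(mu0, cov0) || N(mu1, cov1))."""
    k = mu0.shape[0]
    diff = mu1 - mu0
    cov1_inv = np.linalg.inv(cov1)
    # slogdet returns (sign, log|det|); it is more stable than log(det(.)).
    _, logdet0 = np.linalg.slogdet(cov0)
    _, logdet1 = np.linalg.slogdet(cov1)
    return 0.5 * (np.trace(cov1_inv @ cov0)
                  + diff @ cov1_inv @ diff
                  - k
                  + logdet1 - logdet0)

def symmetric_kld(mu0, cov0, mu1, cov1):
    """Symmetrized KLD (an assumption; the paper may use another form)."""
    return (gaussian_kld(mu0, cov0, mu1, cov1)
            + gaussian_kld(mu1, cov1, mu0, cov0))

def match_clusters(src_means, src_covs, tgt_means, tgt_covs):
    """Pair each source cluster with the target cluster of minimum KLD."""
    dist = np.array([[symmetric_kld(ms, cs, mt, ct)
                      for mt, ct in zip(tgt_means, tgt_covs)]
                     for ms, cs in zip(src_means, src_covs)])
    return dist.argmin(axis=1), dist

# Toy example with two 2-D clusters per speaker (made-up values).
src_means = [np.array([0.0, 0.0]), np.array([5.0, 5.0])]
tgt_means = [np.array([5.2, 4.9]), np.array([0.1, -0.1])]
src_covs = [np.eye(2), np.eye(2)]
tgt_covs = [np.eye(2), np.eye(2)]

pairs, dist = match_clusters(src_means, src_covs, tgt_means, tgt_covs)
print(pairs)  # index of the matched target cluster for each source cluster
```

Under these assumptions, a frame-alignment step would only need to search for a matching target frame among frames assigned to the paired target cluster, rather than among all target frames, which is one way the search space could shrink.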

Author information

Authors and Affiliations

Multimedia Signal Processing Research Laboratory (MSPRL), Electrical Eng. Dept., Amirkabir University of Technology, Hafez Ave., Tehran, Iran
Abdoreza Sabzi Shahrebabaki, Jamal Amini, Hamid Sheikhzadeh, Mostafa Ghorbandoost & Neda Faraji

Editor information

Editors and Affiliations

TCTS Lab, University of Mons, 31, Boulevard Dolez, 7000, Mons, Belgium
Thomas Drugman
TCTS Lab, University of Mons, 31, Boulevard Dolez, 7000, Mons, Belgium
Thierry Dutoit

Cite this paper

Shahrebabaki, A.S., Amini, J., Sheikhzadeh, H., Ghorbandoost, M., Faraji, N. (2013). Reduced Search Space Frame Alignment Based on Kullback-Leibler Divergence for Voice Conversion. In: Drugman, T., Dutoit, T. (eds) Advances in Nonlinear Speech Processing. NOLISP 2013. Lecture Notes in Computer Science, vol 7911. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-38847-7_11

DOI: https://doi.org/10.1007/978-3-642-38847-7_11
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-38846-0
Online ISBN: 978-3-642-38847-7
eBook Packages: Computer Science, Computer Science (R0)
