A Lyrics to Singing Voice Synthesis System with Variable Timbre

Li, Jinlong; Yang, Hongwu; Zhang, Weizhao; Cai, Lianhong

doi:10.1007/978-3-642-23220-6_23

Part of the book series: Communications in Computer and Information Science (CCIS, volume 225)

Included in the following conference series:

International Conference on Applied Informatics and Communication

Abstract

In this paper, we present a singing voice synthesis system that converts lyrics to a singing voice. Because the timbre of the synthesized songs is monotonous, we also propose a singing voice morphing algorithm based on a Gaussian mixture model (GMM). A MOS (mean opinion score) test shows that the average score of the synthesized songs is above 3.3 before timbre conversion. A professional singer's timbre can be added proportionally by changing the scale factor k in the system. An ABX test shows that the accuracy reaches 100% for k=0 or k=1 and is higher than 64.5% for 0<k<1. The experiments also show that the GMM means have a greater impact on a singer's timbre than the weights and covariances.
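
The abstract does not give the morphing formula, so the following is only a minimal sketch of one plausible reading of "adding timbre proportionally": linear interpolation of paired GMM component means between the synthesized voice and the professional singer, controlled by the scale factor k. The function name `morph_means`, the one-to-one pairing of components, and the toy values are assumptions made for illustration, not the authors' implementation.

```python
def morph_means(source_means, target_means, k):
    """Interpolate paired GMM component means by a scale factor k in [0, 1].

    Assumption (not from the paper): component i of the source model is
    paired with component i of the target model, and only the means move;
    weights and covariances are left untouched.
    """
    if not 0.0 <= k <= 1.0:
        raise ValueError("k must lie in [0, 1]")
    if len(source_means) != len(target_means):
        raise ValueError("both models need the same number of components")

    morphed = []
    for src, tgt in zip(source_means, target_means):
        if len(src) != len(tgt):
            raise ValueError("paired means need the same dimension")
        # k = 0 keeps the source mean, k = 1 yields the target mean.
        morphed.append([(1.0 - k) * s + k * t for s, t in zip(src, tgt)])
    return morphed


# Toy values: 2 mixture components with 3-dimensional feature vectors.
synthetic_means = [[0.0, 1.0, 2.0], [3.0, 4.0, 5.0]]
singer_means = [[1.0, 1.0, 4.0], [3.0, 6.0, 5.0]]

halfway = morph_means(synthetic_means, singer_means, 0.5)
print(halfway)
```

Under this reading, k=0 returns the synthesized timbre unchanged and k=1 returns the singer's means, matching the two endpoint cases of the ABX test; intermediate values give a blend.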

This work is partially supported by the National Science Foundation of China (NSFC) under grant No. 60875015 and the Key Project of Chinese Ministry of Education under grant No. 208146.

Author information

Authors and Affiliations

Department of Physics and Electronic Engineering, Northwest Normal University, Lanzhou, Gansu Province, China
Jinlong Li, Hongwu Yang & Weizhao Zhang
Department of Computer Science and Technology, Tsinghua University, Beijing, China
Lianhong Cai

Editor information

Editors and Affiliations

Shenzhen University, Nanhai Ave. 3688, 518060, Shenzhen, Guangdong, China
Dehuai Zeng

About this paper

Cite this paper

Li, J., Yang, H., Zhang, W., Cai, L. (2011). A Lyrics to Singing Voice Synthesis System with Variable Timbre. In: Zeng, D. (eds) Applied Informatics and Communication. ICAIC 2011. Communications in Computer and Information Science, vol 225. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-23220-6_23

DOI: https://doi.org/10.1007/978-3-642-23220-6_23
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-23219-0
Online ISBN: 978-3-642-23220-6
eBook Packages: Computer Science, Computer Science (R0)
