A Lyrics to Singing Voice Synthesis System with Variable Timbre

  • Conference paper

Part of the book series: Communications in Computer and Information Science ((CCIS,volume 225))

Abstract

In this paper, we present a singing voice synthesis system that converts lyrics to a singing voice. Because the timbre of the synthesized songs is monotonous, we also propose a new singing voice morphing algorithm based on a Gaussian mixture model (GMM). A MOS test shows that the average MOS of the synthesized songs is above 3.3 before timbre conversion. A professional singer’s timbre can be blended in proportionally by changing the scale factor k in the system. An ABX test shows that the accuracy reaches up to 100% when k=0 or k=1 and is higher than 64.5% when 0<k<1. The experiments also show that the means of the GMM have a greater impact on a singer’s timbre than the weights and covariances.
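The abstract does not spell out the morphing rule, so the following is only a minimal sketch of one plausible reading of "proportional" blending: linear interpolation of paired GMM component means with the scale factor k. The function name `morph_means`, the pairing of components by index, and the example values are all assumptions for illustration, not the authors' algorithm.

```python
from typing import List


def morph_means(source_means: List[List[float]],
                target_means: List[List[float]],
                k: float) -> List[List[float]]:
    """Blend paired GMM component means (hypothetical helper).

    k = 0 returns the source means unchanged, k = 1 returns the
    target (professional singer) means, and 0 < k < 1 gives a mix.
    Pairing components by index is an assumption of this sketch.
    """
    if not 0.0 <= k <= 1.0:
        raise ValueError("k must lie in [0, 1]")
    if len(source_means) != len(target_means):
        raise ValueError("both models need the same number of components")
    morphed = []
    for src, tgt in zip(source_means, target_means):
        if len(src) != len(tgt):
            raise ValueError("paired means must have the same dimension")
        # Convex combination of the two mean vectors, element by element.
        morphed.append([(1.0 - k) * s + k * t for s, t in zip(src, tgt)])
    return morphed


# Illustrative values only: two components, two spectral dimensions.
source = [[1.0, 2.0], [3.0, 4.0]]
target = [[2.0, 4.0], [5.0, 8.0]]
halfway = morph_means(source, target, 0.5)
```

Only the means are interpolated here because the abstract reports that they affect timbre more than the weights and covariances; a fuller treatment would also have to decide how to handle those parameters.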

This work is partially supported by the National Science Foundation of China (NSFC) under grant No. 60875015 and by the Key Project of the Chinese Ministry of Education under grant No. 208146.



Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Li, J., Yang, H., Zhang, W., Cai, L. (2011). A Lyrics to Singing Voice Synthesis System with Variable Timbre. In: Zeng, D. (eds) Applied Informatics and Communication. ICAIC 2011. Communications in Computer and Information Science, vol 225. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-23220-6_23

  • DOI: https://doi.org/10.1007/978-3-642-23220-6_23

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-23219-0

  • Online ISBN: 978-3-642-23220-6

  • eBook Packages: Computer Science, Computer Science (R0)
