Vocal Manipulation Based on Pitch Transcription and Its Application to Interactive Entertainment for Karaoke

  • Kota Nakano
  • Masanori Morise
  • Takanobu Nishiura
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6851)


A real-time vocal manipulation system is described for improving karaoke. Karaoke is an interactive entertainment system, used all over the world, in which users sing along with recorded music. Users are expected to sing with accurate pitch, but this is difficult for tone-deaf people. In this paper, a real-time vocal manipulation system is proposed to help tone-deaf people. The system is built on a vocoder-based voice synthesis method that synthesizes voiced sound from the fundamental frequency (pitch) and the spectral envelope (timbre). Vocal manipulation is achieved through pitch transcription, that is, by replacing the pitch of a tone-deaf person with that of a professional singer. A subjective evaluation was carried out to verify the effectiveness of the proposed system. The results suggest that the proposed system can manipulate vocal sounds in real time.
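
The pitch-replacement idea can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that the user's and the professional singer's fundamental-frequency (F0) contours are already extracted and time-aligned frame by frame, that a value of 0.0 marks an unvoiced frame, and that the user's spectral envelope is passed to the vocoder unchanged. The function name `transcribe_pitch` and the fallback to the user's own pitch when the reference is unvoiced are hypothetical choices made for illustration only.

```python
from typing import List


def transcribe_pitch(user_f0: List[float], reference_f0: List[float]) -> List[float]:
    """Build the F0 contour handed to the vocoder for resynthesis.

    Assumptions (not from the paper): both contours are frame-aligned,
    and 0.0 denotes an unvoiced frame.
    """
    if len(user_f0) != len(reference_f0):
        raise ValueError("F0 contours must have the same number of frames")

    target_f0 = []
    for user_hz, reference_hz in zip(user_f0, reference_f0):
        if user_hz <= 0.0:
            # The user is not producing voiced sound: keep the frame unvoiced.
            target_f0.append(0.0)
        elif reference_hz <= 0.0:
            # No reference pitch available: fall back to the user's own pitch.
            target_f0.append(user_hz)
        else:
            # Voiced in both: replace the user's pitch with the reference pitch.
            target_f0.append(reference_hz)
    return target_f0


# Example F0 values in Hz for five analysis frames (illustrative numbers).
user = [0.0, 210.0, 215.0, 0.0, 190.0]
reference = [220.0, 220.0, 220.0, 246.9, 0.0]
print(transcribe_pitch(user, reference))  # → [0.0, 220.0, 220.0, 0.0, 190.0]
```

In a complete system the resulting contour would be combined with the user's spectral envelope in the synthesis stage, so that the pitch follows the reference while the timbre stays the user's own.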


Vocal manipulation, Vocoder, Interactive entertainment, Karaoke





Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Kota Nakano (1)
  • Masanori Morise (2)
  • Takanobu Nishiura (2)
  1. Graduate School of Science and Engineering, Ritsumeikan University, Kusatsu, Japan
  2. College of Information and Science, Ritsumeikan University, Kusatsu, Japan
