Study on the Artificial Synthesis of Human Voice Using Radial Basis Function Networks

  • Conference paper
Advanced Methods, Techniques, and Applications in Modeling and Simulation

Part of the book series: Proceedings in Information and Communications Technology (PICT, volume 4)

Abstract

In this study, we introduce a method for reconstructing a more natural synthetic voice using a radial basis function (RBF) network, a type of neural network well suited to function approximation problems, to follow and synthesize vocal fluctuations. In the RBF synthesis simulation, we set the Gaussian functions based on their parameters and attempted to reconstruct the vocal fluctuations. For parameter estimation, we adopted a nonlinear least-squares method in order to take the nonlinearity of the human voice into account. When reproducing the synthesized speech, we attempted to reconstruct the nonlinear fluctuations of amplitude by adding normally distributed random numbers. We compared the real voice with the synthetic voice obtained from the simulation and found that it was possible to synthesize the vocal fluctuations over a short time span.
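
The pipeline described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state the network size, the specific nonlinear least-squares algorithm, or how the random numbers enter the signal. The sketch assumes a sum of one-dimensional Gaussian units over time, a damped Gauss-Newton (Levenberg-Marquardt style) update, additive noise on the output amplitude, and a synthetic decaying oscillation standing in for a real voice frame. The names `rbf_output`, `jacobian`, `fit_rbf`, `k`, and `noise_std` are hypothetical.

```python
import numpy as np

def rbf_output(params, t):
    """Sum of Gaussian units; params stores all weights, then centers, then widths."""
    w, c, s = params.reshape(3, -1)
    phi = np.exp(-((t[:, None] - c) ** 2) / (2.0 * s ** 2))
    return phi @ w

def jacobian(params, t):
    """Analytic derivatives of the output with respect to weights, centers, widths."""
    w, c, s = params.reshape(3, -1)
    d = t[:, None] - c
    phi = np.exp(-(d ** 2) / (2.0 * s ** 2))
    return np.hstack([phi, phi * w * d / s ** 2, phi * w * d ** 2 / s ** 3])

def fit_rbf(t, y, params, n_iter=50, damping=1e-2):
    """Damped Gauss-Newton nonlinear least squares (an assumed choice of solver)."""
    cost = np.sum((y - rbf_output(params, t)) ** 2)
    for _ in range(n_iter):
        r = y - rbf_output(params, t)
        J = jacobian(params, t)
        step = np.linalg.solve(J.T @ J + damping * np.eye(params.size), J.T @ r)
        trial = params + step
        trial_cost = np.sum((y - rbf_output(trial, t)) ** 2)
        if trial_cost < cost:
            # Accept the step and relax the damping (with a small floor).
            params, cost = trial, trial_cost
            damping = max(damping * 0.5, 1e-8)
        else:
            # Reject the step and damp more heavily.
            damping *= 10.0
    return params, cost

rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 200)
# Stand-in for a short voiced frame (synthetic data, not real speech).
target = np.exp(-3.0 * t) * np.sin(2.0 * np.pi * 5.0 * t)

k = 12  # number of Gaussian units, chosen only for illustration
init = np.concatenate([np.zeros(k), np.linspace(0.0, 1.0, k), np.full(k, 0.05)])
initial_cost = np.sum((target - rbf_output(init, t)) ** 2)

params, final_cost = fit_rbf(t, target, init)

# Assumed form of the amplitude fluctuation: additive normal random numbers.
noise_std = 0.02
synthetic = rbf_output(params, t) + rng.normal(0.0, noise_std, size=t.shape)
```

Because a step is accepted only when it lowers the squared error, the fitted cost never exceeds the starting cost. Comparing `synthetic` with `target` over the frame mirrors, in a simplified way, the comparison of real and synthetic voice that the abstract reports.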

Copyright information

© 2012 Springer Tokyo

About this paper

Cite this paper

Naniwa, Y., Kondo, T., Kamiyama, K., Kamata, H. (2012). Study on the Artificial Synthesis of Human Voice Using Radial Basis Function Networks. In: Kim, JH., Lee, K., Tanaka, S., Park, SH. (eds) Advanced Methods, Techniques, and Applications in Modeling and Simulation. Proceedings in Information and Communications Technology, vol 4. Springer, Tokyo. https://doi.org/10.1007/978-4-431-54216-2_32

  • DOI: https://doi.org/10.1007/978-4-431-54216-2_32

  • Publisher Name: Springer, Tokyo

  • Print ISBN: 978-4-431-54215-5

  • Online ISBN: 978-4-431-54216-2

  • eBook Packages: Computer Science, Computer Science (R0)
