Study on the Artificial Synthesis of Human Voice Using Radial Basis Function Networks

Naniwa, Yuuki; Kondo, Takaaki; Kamiyama, Kyohei; Kamata, Hiroyuki

doi:10.1007/978-4-431-54216-2_32

Part of the book series: Proceedings in Information and Communications Technology (PICT, volume 4)

Abstract

In this study, we introduce a method for reconstructing a more natural synthetic voice using a radial basis function (RBF) network, a type of neural network well suited to function approximation problems, to track and synthesize vocal fluctuations. In the RBF synthesis simulation, we set up Gaussian basis functions from estimated parameters and attempted to reconstruct the vocal fluctuations. For parameter estimation, we adopted a nonlinear least-squares method in order to account for the nonlinearity of the human voice. When reproducing the synthesized speech, we attempted to reconstruct the nonlinear amplitude fluctuations by adding normally distributed random numbers. We compared the real voice with the synthetic voice obtained from the simulation and found that the vocal fluctuations could be synthesized over a short time span.
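The pipeline outlined in the abstract (Gaussian basis functions, a least-squares fit, added normal random numbers) can be illustrated with a minimal sketch. It is not the authors' implementation, and everything specific in it is an assumption made for illustration: the toy two-sinusoid waveform standing in for a voice segment, the 15 evenly spaced centres, the shared width of 0.08, the noise level, and all function names. As a deliberate simplification, the centres and width are held fixed so that only the output weights are estimated, which reduces the fit to linear least squares; the chapter itself reports using a nonlinear least-squares method. The normal random numbers are drawn with the Box-Muller transform.

```python
import math
import random


def gaussian(x, centre, width):
    """Gaussian basis function exp(-(x - centre)^2 / (2 * width^2))."""
    return math.exp(-((x - centre) ** 2) / (2.0 * width ** 2))


def design_row(x, centres, width):
    """Responses of every basis function at the input x."""
    return [gaussian(x, c, width) for c in centres]


def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_weights(xs, ys, centres, width, ridge=1e-8):
    """Linear least-squares output weights (assumed simplification).

    Solves the normal equations (Phi^T Phi + ridge * I) w = Phi^T y; the
    small ridge term only guards against an ill-conditioned system.
    """
    phi = [design_row(x, centres, width) for x in xs]
    k = len(centres)
    ata = [[sum(row[i] * row[j] for row in phi) + (ridge if i == j else 0.0)
            for j in range(k)] for i in range(k)]
    aty = [sum(row[i] * y for row, y in zip(phi, ys)) for i in range(k)]
    return solve(ata, aty)


def predict(x, centres, width, weights):
    """RBF network output: weighted sum of the Gaussian responses."""
    return sum(w * g for w, g in zip(weights, design_row(x, centres, width)))


def box_muller(rng):
    """One standard normal deviate from two uniform deviates."""
    u1 = 1.0 - rng.random()  # lies in (0, 1], so log(u1) is defined
    u2 = rng.random()
    return math.sqrt(-2.0 * math.log(u1)) * math.cos(2.0 * math.pi * u2)


# Toy periodic waveform standing in for a short voice segment (assumption).
N = 200
ts = [i / (N - 1) for i in range(N)]
target = [math.sin(2 * math.pi * t) + 0.3 * math.sin(6 * math.pi * t) for t in ts]

centres = [j / 14 for j in range(15)]  # hypothetical, evenly spaced on [0, 1]
width = 0.08                           # hypothetical shared width

weights = fit_weights(ts, target, centres, width)
fitted = [predict(t, centres, width, weights) for t in ts]
rms = math.sqrt(sum((f - y) ** 2 for f, y in zip(fitted, target)) / N)

# Add normal random numbers to imitate amplitude fluctuation (assumed level).
rng = random.Random(0)
noise_std = 0.02
synthetic = [f + noise_std * box_muller(rng) for f in fitted]

print(f"RMS error of the noise-free RBF fit: {rms:.4g}")
```

Fixing the centres and width keeps the example short and dependency-free. Estimating them jointly with the weights would turn the fit into a genuinely nonlinear least-squares problem, for which an iterative scheme such as Gauss-Newton would be needed.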

Author information

Authors and Affiliations

Graduate School of Science and Technology, Meiji University, Kawasaki, Japan
Yuuki Naniwa, Takaaki Kondo & Kyohei Kamiyama
School of Science and Technology, Meiji University, Kawasaki, Japan
Hiroyuki Kamata

Editor information

Editors and Affiliations

Yonsei University, Korea
Jong-Hyun Kim
MyongJi University, Korea
Kangsun Lee
Ritsumeikan University, Japan
Satoshi Tanaka
Kookmin University, Korea
Soo-Hyun Park

About this paper

Cite this paper

Naniwa, Y., Kondo, T., Kamiyama, K., Kamata, H. (2012). Study on the Artificial Synthesis of Human Voice Using Radial Basis Function Networks. In: Kim, JH., Lee, K., Tanaka, S., Park, SH. (eds) Advanced Methods, Techniques, and Applications in Modeling and Simulation. Proceedings in Information and Communications Technology, vol 4. Springer, Tokyo. https://doi.org/10.1007/978-4-431-54216-2_32

DOI: https://doi.org/10.1007/978-4-431-54216-2_32
Publisher Name: Springer, Tokyo
Print ISBN: 978-4-431-54215-5
Online ISBN: 978-4-431-54216-2
eBook Packages: Computer Science, Computer Science (R0)
