
An Improved Speech Synthesis Algorithm with Post filter Parameters Based on Deep Neural Network

  • Conference paper

Part of the book series: Lecture Notes in Electrical Engineering (LNEE, volume 517)

Abstract

Statistical parametric speech synthesis typically relies on context-dependent Hidden Markov Models (HMMs) built on decision-tree clustering. However, a clustering decision tree can only partition the model space rigidly by feature, which over-smooths the speech parameters generated from the HMM. In this paper, a Deep Neural Network (DNN) replaces the clustering decision tree, and we propose an improved speech synthesis algorithm based on post-filter parameters. The method enhances the formant regions of the synthesized speech spectrum by selecting the best filter parameter according to the flatness of the spectrum. The experimental results show that the DNN effectively alleviates the over-smoothing of the generated parameters. Furthermore, the improved post-filter algorithm increases the naturalness of the synthesized speech.
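The abstract does not give the selection rule, so the following is only a minimal sketch of the general idea of choosing a post-filter parameter from spectral flatness. It uses the standard flatness measure (geometric mean over arithmetic mean of the power spectrum). The function `choose_postfilter_beta`, its linear mapping, and the bounds `beta_min` and `beta_max` are hypothetical illustrations, not the authors' method.

```python
import math


def spectral_flatness(power_spectrum):
    """Geometric mean divided by arithmetic mean of a power spectrum.

    Values near 1 indicate a flat spectrum; values near 0 indicate
    a spectrum dominated by a few strong peaks.
    """
    eps = 1e-12  # guards against log(0); an illustrative choice
    values = [max(p, eps) for p in power_spectrum]
    log_mean = sum(math.log(v) for v in values) / len(values)
    arithmetic_mean = sum(values) / len(values)
    return math.exp(log_mean) / arithmetic_mean


def choose_postfilter_beta(flatness, beta_min=0.1, beta_max=0.5):
    """Hypothetical rule: flatter spectra receive stronger formant emphasis.

    The linear mapping and the default bounds are assumptions made for
    this sketch; the paper's actual rule is not stated in the abstract.
    """
    flatness = min(max(flatness, 0.0), 1.0)
    return beta_min + (beta_max - beta_min) * flatness


# Toy spectra: one perfectly flat, one with two pronounced peaks.
flat = [1.0] * 8
peaky = [0.01, 0.01, 4.0, 0.01, 0.01, 2.0, 0.01, 0.01]

for name, spectrum in (("flat", flat), ("peaky", peaky)):
    sfm = spectral_flatness(spectrum)
    print(name, round(sfm, 4), round(choose_postfilter_beta(sfm), 4))
```

Under this assumed mapping, an over-smoothed (flatter) frame would be given a larger emphasis coefficient than a frame whose formant peaks are already pronounced.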



Acknowledgements

This paper is supported by the URTP project of School of SME, the demonstration course project of Xidian University, and the Ministry of Education cooperation collaborative education project.

Corresponding author

Correspondence to Hong Zhang.


Copyright information

© 2020 Springer Nature Singapore Pte Ltd.

About this paper


Cite this paper

Dong, S., Li, C., Zhang, H. (2020). An Improved Speech Synthesis Algorithm with Post filter Parameters Based on Deep Neural Network. In: Liang, Q., Liu, X., Na, Z., Wang, W., Mu, J., Zhang, B. (eds) Communications, Signal Processing, and Systems. CSPS 2018. Lecture Notes in Electrical Engineering, vol 517. Springer, Singapore. https://doi.org/10.1007/978-981-13-6508-9_30


  • DOI: https://doi.org/10.1007/978-981-13-6508-9_30

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-13-6507-2

  • Online ISBN: 978-981-13-6508-9

  • eBook Packages: Engineering, Engineering (R0)
