An Improvement of Prosodic Characteristics in Vietnamese Text to Speech System

  • Thanh Son Phan
  • Anh Tuan Dinh
  • Tat Thang Vu
  • Chi Mai Luong
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 244)


An important goal of a text-to-speech (TTS) system is to generate natural-sounding synthesized speech. To meet this goal, a variety of tasks are performed to model the prosodic aspects of the TTS voice. The task discussed here is part-of-speech (POS) and intonation tagging. The paper examines the effects of POS and intonation information on the naturalness of hidden Markov model (HMM) based speech synthesis when other resources are not available. It finds that, when a limited feature set is used for the HMM context labels, the POS and intonation tags improve the naturalness of the synthesized voice.
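To make the idea of adding POS and intonation tags to HMM context labels more concrete, the following is a minimal Python sketch. It is not the authors' implementation: the `Syllable` class, the tag values, and the label layout (`prev^cur+next/POS:.../INT:...`) are all hypothetical choices made only for illustration, and the paper's actual label format may differ.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Syllable:
    """One syllable with the two prosodic tags discussed in the abstract."""
    text: str
    pos: str         # POS tag of the containing word (hypothetical tag set)
    intonation: str  # intonation tag (hypothetical tag set, e.g. "L" or "H")


def context_label(prev: Optional[Syllable], cur: Syllable,
                  nxt: Optional[Syllable]) -> str:
    """Build a context label string for `cur` from its neighbours.

    The field layout is an illustrative assumption, not the format
    used in the paper.
    """
    def field(s: Optional[Syllable], attr: str) -> str:
        # "x" marks a missing neighbour at an utterance boundary.
        return getattr(s, attr) if s is not None else "x"

    ident = f"{field(prev, 'text')}^{cur.text}+{field(nxt, 'text')}"
    pos = f"/POS:{field(prev, 'pos')}_{cur.pos}_{field(nxt, 'pos')}"
    into = (f"/INT:{field(prev, 'intonation')}_{cur.intonation}"
            f"_{field(nxt, 'intonation')}")
    return ident + pos + into


def label_utterance(syllables: List[Syllable]) -> List[str]:
    """Return one context label per syllable in the utterance."""
    labels = []
    for i, cur in enumerate(syllables):
        prev = syllables[i - 1] if i > 0 else None
        nxt = syllables[i + 1] if i + 1 < len(syllables) else None
        labels.append(context_label(prev, cur, nxt))
    return labels


if __name__ == "__main__":
    utterance = [Syllable("xin", "V", "L"), Syllable("chao", "V", "H")]
    for label in label_utterance(utterance):
        print(label)
```

In a sketch like this, each label carries the POS and intonation tags of the current syllable and its immediate neighbours, so that models trained on the labels can, in principle, be clustered by those prosodic contexts even when richer linguistic features are unavailable.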


Hidden Markov Model, Natural Speech, Prosodic Characteristic, Male Voice, Pitch Accent
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Thanh Son Phan (1, corresponding author)
  • Anh Tuan Dinh (2)
  • Tat Thang Vu (2)
  • Chi Mai Luong (2)

  1. Faculty of Information Technology, Le Qui Don Technical University, Hanoi City, Vietnam
  2. Institute of Information Technology, Vietnam Academy of Science and Technology, Hanoi City, Vietnam
