Duration Modeling Using Multi-model Based on Positional Information
This paper proposes predicting syllable durations with a multi-model built on positional information. The proposed multi-model consists of four models. One model predicts the durations of syllables in monosyllabic words, and the remaining three predict the durations of syllables at the initial, middle and final positions of polysyllabic words. In this study, (i) linguistic constraints represented by positional, contextual and phonological features and (ii) production constraints represented by articulatory features are used for predicting the duration patterns. Feed-forward Neural Networks (FFNN) are used to develop the duration models from the above-mentioned features. It was found that prediction accuracy improves with the multi-model compared to a single duration model.
Keywords: Multi-models; Duration prediction; Prediction accuracy; Feed-forward neural networks; Linguistic and production constraints
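The four-way split described in the abstract can be illustrated with a minimal sketch of the routing step: each syllable is sent to one of four predictors according to its position in the word. All names here (`position_class`, `predict_durations`, the label strings) are hypothetical and do not come from the paper. The predictors are plain callables standing in for trained FFNNs, and treating a two-syllable word as initial plus final, with no middle syllable, is an assumption.

```python
from typing import Callable, Dict, List, Sequence

# Hypothetical labels for the four position-specific models.
MONO, INITIAL, MIDDLE, FINAL = "mono", "initial", "middle", "final"

Features = Sequence[float]
Model = Callable[[Features], float]


def position_class(syllable_index: int, word_length: int) -> str:
    """Map a syllable's position in its word to one of the four labels."""
    if word_length < 1 or not 0 <= syllable_index < word_length:
        raise ValueError("syllable_index must lie inside the word")
    if word_length == 1:
        return MONO
    if syllable_index == 0:
        return INITIAL
    if syllable_index == word_length - 1:
        return FINAL
    return MIDDLE


def predict_durations(
    words: List[List[Features]], models: Dict[str, Model]
) -> List[float]:
    """Predict one duration per syllable.

    `words` is a list of words, each word a list of per-syllable feature
    vectors. `models` maps each label to a predictor (assumed already trained).
    """
    durations = []
    for word in words:
        for index, features in enumerate(word):
            model = models[position_class(index, len(word))]
            durations.append(model(features))
    return durations
```

In practice each entry of `models` would wrap a separately trained network fed with the positional, contextual, phonological and articulatory features of the syllable; only the dispatch logic is shown here.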