Abstract
Text-to-speech (TTS) systems are a widely studied application in computer science. They are more common for languages with rich resources, such as English, and have been pursued less rigorously for under-resourced languages such as Nepali. Nevertheless, TTS has a wide scope of application in areas including telephony, e-learning, and telecommunication.
Developing a natural-sounding TTS system is difficult for under-resourced languages, primarily because of the linguistic resources such a system requires. Preparing these resources is costly and time-consuming and requires the involvement of linguists and other experts. The general trend in this research domain is therefore to develop natural-sounding TTS from the limited resources available. Nepali, being an under-resourced language, has very few linguistic resources available for developing a TTS system.
In this work, we modified the existing TTS system [9] by adding computational units that process its input and output; we call these the pre-processing and post-processing modules. We also pruned and added phonetic rules and normalization rules, and made the system available to the public through a desktop application and a Firefox plugin.
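To make the idea of a pre-processing module concrete, the following is a minimal sketch of one normalization rule: reading Devanagari digits aloud one at a time and tidying whitespace before synthesis. It is an illustration only, not the rule set used in this work; the function name `preprocess` and the digit-by-digit reading are assumptions.

```python
import re

# Devanagari digit -> Nepali word, used for a digit-by-digit reading.
DIGIT_WORDS = {
    "०": "शून्य", "१": "एक", "२": "दुई", "३": "तीन", "४": "चार",
    "५": "पाँच", "६": "छ", "७": "सात", "८": "आठ", "९": "नौ",
}


def preprocess(text: str) -> str:
    """Hypothetical pre-processing step (not the authors' implementation)."""

    def spell(match: re.Match) -> str:
        # Pad with spaces so the words do not fuse with neighbouring text.
        return " " + " ".join(DIGIT_WORDS[d] for d in match.group(0)) + " "

    # Replace each run of Devanagari digits with space-separated digit words.
    text = re.sub(r"[०-९]+", spell, text)
    # Collapse repeated whitespace introduced by the input or by the padding.
    return re.sub(r"\s+", " ", text).strip()
```

A real normalizer would also need rules for reading whole numbers, dates, abbreviations, and symbols; the digit rule is shown only because it is the simplest case to state.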
We evaluated the existing and modified TTS systems using qualitative evaluation techniques in which 30 users rated the systems on two parameters: intelligibility and naturalness. Our results show an overall improvement of 6% in naturalness and intelligibility, while the results of the comprehension test and the diagnostic rhyme test increased by 12% and 10%, respectively.
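As a rough illustration of how such a comparison can be summarized, the sketch below averages listener ratings for two systems and reports the relative change in the mean. The ratings are made-up placeholders, not data from this study, and the abstract does not state whether its percentages are relative changes or percentage-point differences; the relative form here is an assumption.

```python
from statistics import mean


def relative_improvement(baseline: list[float], modified: list[float]) -> float:
    """Percentage change in the mean rating from baseline to modified."""
    base_mean = mean(baseline)
    return 100.0 * (mean(modified) - base_mean) / base_mean


# Hypothetical 1-5 ratings from four listeners (illustrative only).
existing_system = [2, 3, 4, 3]
modified_system = [3, 3, 4, 5]
improvement = relative_improvement(existing_system, modified_system)
```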
Notes
1. REST stands for Representational State Transfer (sometimes spelled "ReST"). It relies on a stateless, client-server, cacheable communications protocol, and in virtually all cases HTTP is used. REST is an architectural style for designing networked applications.
References
Cha, J.S., Lim, D.K., Shin, Y.N.: Design and implementation of a voice based navigation for visually impaired persons. Int. J. Bio-Sci. Bio-Technol. 5(3) (2013)
Dutoit, T.: An Introduction to Text-to-Speech Synthesis. Kluwer Academic Publishers, Dordrecht (1997). pp. 13, 14, 63, 72, 179, 196
Dutoit, T.: A Short Introduction to Text-to-Speech Synthesis. TTS Research Team, TCTS Lab (1999)
FestivalEngine (2014). http://www.cstr.ed.ac.uk/projects/festival/. Accessed 18 May 2014
Festvox (2014). http://festvox.org/. Accessed 10 June 2014
Hunt, A.J., Black, A.W.: Unit selection in a concatenative speech synthesis system using a large speech database. In: IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP-1996) (1996)
ITU-T: Series P: telephone transmission quality - methods for objective and subjective assessment of quality (1996)
Jurafsky, D., Martin, J.H.: Speech and Language Processing, 2nd edn. Pearson Education, London (2009)
Nepali-TTS: Full manual of Nepali TTS (2008)
Nepali-TTS (2008). http://bhashasanchar.org/textspeech_intro.php. Accessed 18 Feb 2014
Taylor, P., Black, A.W., Caley, R.: The architecture of the festival speech synthesis system. Centre for Speech Technology Research (1998)
Sproat, R., Black, A.W., Chen, S., Kumar, S., Ostendorf, M., Richards, C.: Normalization of non-standard words. Comput. Speech Lang. 15(3), 287–333 (2001). http://dx.doi.org/10.1006/csla.2001.0169
Wang, W.Y., Georgila, K.: Automatic detection of unnatural word-level segments in unit-selection speech synthesis. In: IEEE ASRU (2011)
© 2017 Springer International Publishing AG
Cite this paper
Ghimire, R.R., Bal, B.K. (2017). Enhancing the Quality of Nepali Text-to-Speech Systems. In: Kravets, A., Shcherbakov, M., Kultsova, M., Groumpos, P. (eds) Creativity in Intelligent Technologies and Data Science. CIT&DS 2017. Communications in Computer and Information Science, vol 754. Springer, Cham. https://doi.org/10.1007/978-3-319-65551-2_14
DOI: https://doi.org/10.1007/978-3-319-65551-2_14
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-65550-5
Online ISBN: 978-3-319-65551-2
eBook Packages: Computer Science (R0)