Unit Selection Using Linguistic, Prosodic and Spectral Distance for Developing Text-to-Speech System in Hindi

Sreenivasa Rao, K.; Maity, Sudhamay; Taru, Amol; Koolagudi, Shashidhar G.

doi:10.1007/978-3-642-11164-8_86

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 5909)

Included in the following conference series:

International Conference on Pattern Recognition and Machine Intelligence

Abstract

In this paper we propose a new method for unit selection in developing a text-to-speech (TTS) system for Hindi. In the proposed method, syllables are used as the basic units for concatenation. Linguistic, positional and contextual features derived from the input text are used at the first level of the unit selection process. The process is then refined by incorporating prosodic and spectral characteristics at the utterance and syllable levels. The speech corpus considered for this task is broadcast Hindi news read by a male speaker. Speech synthesized by the developed TTS system using the multi-level unit selection criterion is evaluated using listening tests. The evaluation results show that refining the unit selection process with spectral and prosodic features improves the quality of the synthesized speech.
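
The multi-level selection described in the abstract can be illustrated with a minimal sketch. The abstract gives no feature definitions, distance measures or weights, so everything below is an assumption made for illustration: the `Target` and `Unit` fields, the count of matching positional and contextual features used at the first level, the relative duration and F0 differences used as prosodic distance, the Euclidean spectral distance to the previously selected unit, and the weights `w_pros` and `w_spec`. It is not the authors' implementation.

```python
from dataclasses import dataclass
from math import sqrt
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class Target:
    """Syllable requested by the text front end (all fields are assumed)."""
    syllable: str
    position: str        # assumed labels: "initial", "middle", "final"
    prev_syllable: str   # "#" marks a word boundary (assumed convention)
    next_syllable: str
    duration_ms: float   # assumed predicted duration
    f0_hz: float         # assumed predicted mean pitch


@dataclass(frozen=True)
class Unit:
    """Syllable instance stored in the speech corpus (all fields are assumed)."""
    syllable: str
    position: str
    prev_syllable: str
    next_syllable: str
    duration_ms: float
    f0_hz: float
    spectrum: Tuple[float, ...]  # assumed spectral vector at the unit boundary


def text_score(target: Target, unit: Unit) -> int:
    """Level 1: count matching positional and contextual features."""
    return sum((
        unit.position == target.position,
        unit.prev_syllable == target.prev_syllable,
        unit.next_syllable == target.next_syllable,
    ))


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two equal-length feature vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def select_unit(target: Target,
                prev_spectrum: Sequence[float],
                corpus: Sequence[Unit],
                w_pros: float = 1.0,
                w_spec: float = 1.0) -> Optional[Unit]:
    """Pick one corpus unit for the target syllable, or None if absent."""
    candidates = [u for u in corpus if u.syllable == target.syllable]
    if not candidates:
        return None

    # Level 1: keep only the candidates with the best text-derived match.
    best = max(text_score(target, u) for u in candidates)
    shortlist = [u for u in candidates if text_score(target, u) == best]

    # Level 2: rank the shortlist by weighted prosodic and spectral distance.
    def cost(u: Unit) -> float:
        prosodic = (abs(u.duration_ms - target.duration_ms) / target.duration_ms
                    + abs(u.f0_hz - target.f0_hz) / target.f0_hz)
        spectral = euclidean(u.spectrum, prev_spectrum)
        return w_pros * prosodic + w_spec * spectral

    return min(shortlist, key=cost)


# Toy corpus with invented values, for illustration only.
corpus = [
    Unit("ra", "final", "ka", "#", 180.0, 120.0, (1.0, 0.5)),
    Unit("ra", "initial", "#", "ja", 150.0, 130.0, (0.9, 0.4)),
    Unit("ra", "initial", "#", "ja", 210.0, 110.0, (0.2, 0.1)),
]
target = Target("ra", "initial", "#", "ja", 160.0, 128.0)
chosen = select_unit(target, (1.0, 0.5), corpus)
```

In this toy example the first level discards the unit whose position and context do not match, and the second level chooses between the two remaining candidates by their prosodic and spectral distance. A full system would presumably search over whole utterances rather than one syllable at a time; the greedy per-syllable choice here is a simplification.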

Author information

Authors and Affiliations

School of Information Technology, Indian Institute of Technology Kharagpur, Kharagpur, 721302, West Bengal, India
K. Sreenivasa Rao, Sudhamay Maity, Amol Taru & Shashidhar G. Koolagudi

Editor information

Editors and Affiliations

Electrical Engineering Department, Indian Institute of Technology Delhi, 110016, New Delhi, India
Santanu Chaudhury
Center for Soft Computing Research, Indian Statistical Institute, 700 108, Kolkata, India
Sushmita Mitra
Center for Soft Computing Research, Indian Statistical Institute,
C. A. Murthy
Department of Electrical Engineering, Indian Institute of Science, 560012, Bangalore, India
P. S. Sastry
Center for Soft Computing Research, Machine Intelligence Unit, Indian Statistical Institute, 203 Barrackpore Trunk Road, 700 108, Kolkata, India
Sankar K. Pal

Cite this paper

Sreenivasa Rao, K., Maity, S., Taru, A., Koolagudi, S.G. (2009). Unit Selection Using Linguistic, Prosodic and Spectral Distance for Developing Text-to-Speech System in Hindi. In: Chaudhury, S., Mitra, S., Murthy, C.A., Sastry, P.S., Pal, S.K. (eds) Pattern Recognition and Machine Intelligence. PReMI 2009. Lecture Notes in Computer Science, vol 5909. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-11164-8_86

DOI: https://doi.org/10.1007/978-3-642-11164-8_86
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-11163-1
Online ISBN: 978-3-642-11164-8
eBook Packages: Computer Science, Computer Science (R0)
