Corpus Based Emotional Speech Synthesis in Hindi

Bhakat, Ravi Kalyan; Narendra, N. P.; Sreenivasa Rao, Krothapalli

doi:10.1007/978-3-642-45062-4_53

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 8251)

Included in the following conference series:

International Conference on Pattern Recognition and Machine Intelligence

Abstract

This paper explores a unit-selection-based concatenative approach to emotional speech synthesis in Hindi. The emotions explored are sad and neutral. The Festival framework is used as the underlying Text-To-Speech (TTS) system, and the steps followed to create a new voice in Festival are described. The developed TTS systems are assessed with subjective evaluation tests, which indicate a significant improvement in synthesis quality after the necessary prosody modifications. Finally, possible improvements to the systems are put forward.

Author information

Authors and Affiliations

Indian Institute of Technology Kharagpur, Kharagpur, 721302, West Bengal, India
Ravi Kalyan Bhakat, N. P. Narendra & Krothapalli Sreenivasa Rao

Editor information

Editors and Affiliations

Machine Intelligence Unit, Indian Statistical Institute, 203, B. T. Road, 700108, Kolkata, India
Pradipta Maji, Ashish Ghosh, Kuntal Ghosh & Sankar K. Pal
Department of Computer Science and Automation, Indian Institute of Science, 560012, Bangalore, India
M. Narasimha Murty

Cite this paper

Bhakat, R.K., Narendra, N.P., Sreenivasa Rao, K. (2013). Corpus Based Emotional Speech Synthesis in Hindi. In: Maji, P., Ghosh, A., Murty, M.N., Ghosh, K., Pal, S.K. (eds) Pattern Recognition and Machine Intelligence. PReMI 2013. Lecture Notes in Computer Science, vol 8251. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-45062-4_53

DOI: https://doi.org/10.1007/978-3-642-45062-4_53
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-45061-7
Online ISBN: 978-3-642-45062-4
eBook Packages: Computer Science, Computer Science (R0)
