Abstract
This chapter explores how the RPM, as outlined in Section , can improve the quality of unit selection voices created with MaryTTS. To this end, the unit selection approach of MaryTTS is described and methods for generating multiple versions of the same utterance are introduced (Section ). The RPM can then be used to choose the best of these alternatives. A listening test is conducted to examine whether some of the generated alternative versions actually offer higher quality than the original MaryTTS output (Section and ). Finally, different RPM are applied to estimate the quality of the synthesized speech signals (Section and ), and the results are used to specify requirements for the integration of instrumental quality measures (Section ).
Notes
1. -b and l- mark the starting and ending sounds of the word bQl.
Copyright information
© 2017 Springer Nature Singapore Pte Ltd.
About this chapter
Cite this chapter
Hinterleitner, F. (2017). Requirements for the Integration of an Instrumental Quality Measure into a Concatenative TTS System. In: Quality of Synthetic Speech. T-Labs Series in Telecommunication Services. Springer, Singapore. https://doi.org/10.1007/978-981-10-3734-4_7
DOI: https://doi.org/10.1007/978-981-10-3734-4_7
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-3733-7
Online ISBN: 978-981-10-3734-4
eBook Packages: Engineering, Engineering (R0)