Requirements for the Integration of an Instrumental Quality Measure into a Concatenative TTS System

Hinterleitner, Florian

doi:10.1007/978-981-10-3734-4_7

Part of the book series: T-Labs Series in Telecommunication Services (TLABS)

Abstract

This chapter explores how the RPM can be used to improve the quality of unit selection voices created with MaryTTS. To this end, the unit selection approach of MaryTTS is described, and methods for generating multiple versions of the same utterance are introduced. The RPM can then be used to choose the best of these alternatives. A listening test is conducted to examine whether some of the generated alternatives actually offer higher quality than the original MaryTTS output. Finally, different RPMs are applied to estimate the quality of the synthesized speech signals, and the results are used to specify requirements for integrating instrumental quality measures into a concatenative TTS system.
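The selection step described in the abstract, scoring several synthesized versions of one utterance and keeping the one with the highest predicted quality, can be sketched as follows. This is only an illustration under stated assumptions: the function name `select_best_version`, the `predict_quality` callable, the file names, and the toy scores are all hypothetical placeholders and are not taken from the chapter or from MaryTTS.

```python
from typing import Callable, Sequence, Tuple


def select_best_version(
    candidates: Sequence[str],
    predict_quality: Callable[[str], float],
) -> Tuple[str, float]:
    """Return the candidate with the highest predicted quality and its score.

    `candidates` are identifiers of alternative renderings of one utterance
    (hypothetical file names here). `predict_quality` stands in for an
    instrumental quality measure; higher is assumed to mean better.
    """
    if not candidates:
        raise ValueError("at least one candidate version is required")
    # Score every alternative once, then keep the highest-scoring one.
    scored = [(predict_quality(c), c) for c in candidates]
    best_score, best = max(scored, key=lambda pair: pair[0])
    return best, best_score


# Made-up scores for illustration only; these are not results from the chapter.
toy_scores = {"original.wav": 3.1, "alt_1.wav": 3.4, "alt_2.wav": 2.8}
best, score = select_best_version(list(toy_scores), toy_scores.__getitem__)
print(best, score)
```

In such a setup the selection is only as good as the predictor: if predicted scores do not track listener judgments closely enough, the chosen alternative may be no better than the default output, which is the kind of question the listening test in the chapter addresses.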

Notes

1. -b and l- mark the starting and ending sounds of the word bQl.

Author information

Authors and Affiliations

Quality and Usability Lab, Institute of Software Engineering and Theoretical Computer Science, Berlin Institute of Technology, Berlin, Germany
Florian Hinterleitner

About this chapter

Cite this chapter

Hinterleitner, F. (2017). Requirements for the Integration of an Instrumental Quality Measure into a Concatenative TTS System. In: Quality of Synthetic Speech. T-Labs Series in Telecommunication Services. Springer, Singapore. https://doi.org/10.1007/978-981-10-3734-4_7

DOI: https://doi.org/10.1007/978-981-10-3734-4_7
Published: 09 April 2017
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-3733-7
Online ISBN: 978-981-10-3734-4
eBook Packages: Engineering, Engineering (R0)
