Technical and Phonetic Aspects of Speech Quality Assessment: The Case of Prosody Synthesis

Tučková, Jana; Holub, Jan; Duběda, Tomáš

doi:10.1007/978-3-642-03320-9_13

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5641)

Abstract

This paper discusses the methods used for subjective assessment of speech quality in the technical sciences and in linguistics. Stressing that purely mathematical evaluation of synthetic speech is not sufficient, we aim to show that the perspectives and approaches of the two scientific domains are not necessarily the same. We then report a pilot experiment in which synthetic sentences generated by five different prosodic models were assessed using the MOS formalism (ITU-T P.800). The two groups of listeners involved (students of engineering and students of linguistics) gave rather similar results.
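As a rough illustration of how MOS results of this kind are usually summarised, the sketch below averages ratings on the five-point ACR scale of ITU-T P.800 (5 = excellent, 1 = bad) for each combination of prosodic model and listener group. The function names, the model and group labels, and all ratings are hypothetical placeholders, not data or code from the paper. The confidence interval uses a normal approximation, which is an assumption rather than the authors' stated procedure.

```python
from math import sqrt
from statistics import mean, stdev


def mos(scores):
    """Return (mean opinion score, 95% CI half-width) for ACR ratings in 1..5."""
    if not scores:
        raise ValueError("at least one rating is required")
    if any(not 1 <= s <= 5 for s in scores):
        raise ValueError("ACR ratings must lie between 1 and 5")
    m = mean(scores)
    # Normal approximation (assumption); a Student-t quantile would give a
    # wider interval for small listener panels.
    half = 1.96 * stdev(scores) / sqrt(len(scores)) if len(scores) > 1 else 0.0
    return m, half


def summarize(ratings):
    """Map each (model, listener group) key to its (MOS, CI half-width)."""
    return {key: mos(vals) for key, vals in ratings.items()}


# Hypothetical placeholder ratings, NOT results from the paper.
ratings = {
    ("model_A", "engineering"): [4, 3, 4, 5, 3],
    ("model_A", "linguistics"): [4, 4, 3, 4, 3],
    ("model_B", "engineering"): [2, 3, 2, 3, 3],
    ("model_B", "linguistics"): [3, 2, 2, 3, 2],
}

if __name__ == "__main__":
    for (model, group), (score, half) in summarize(ratings).items():
        print(f"{model:8s} {group:12s} MOS = {score:.2f} ± {half:.2f}")
```

Comparing the two listener groups for the same model, and checking whether their intervals overlap, is one simple way to judge whether the groups gave similar results.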

References

ITU-T P.800. Telecommunication standardization sector of ITU, Methods for objective and subjective assessment of transmission quality, ITU (1996)
Hoene, C.: Internet Telephony over Wireless Links. Ph.D Thesis. TU Berlin (2005)
Pisoni, D.B., Remez, R.E. (eds.): The Handbook of Speech Perception. Blackwell Publishing, Malden (2005)
Kohler, K.J.: Paradigms in experimental prosodic analysis: from measurement to function. In: Sudhoff, S., et al. (eds.) Methods in Empirical Prosody Research, pp. 123–152. Walter de Gruyter, Berlin (2006)
COCOSDA International Committee for Co-ordination and Standardisation of Speech Databases, http://www.cocosda.org/
Matoušek, J., Tihelka, D., Romportl, J.: Current state of Czech text-to-speech system ARTIC. In: Sojka, P., Kopeček, I., Pala, K. (eds.) TSD 2006. LNCS (LNAI), vol. 4188, pp. 439–446. Springer, Heidelberg (2006)
Riedi, M.: A neural-network-based model of segmental duration for speech synthesis. In: Proc. Eurospeech 1995, vol. 1, pp. 599–602. European Speech Communication Association (1997)
Traber, C.: F₀ generation with a database of natural F₀ patterns and with a neural network. In: Bailly, G., Benoît, C., Sawallis, T.R. (eds.) Talking Machines: Theories, Models, and Design, pp. 287–304. Elsevier Science Publishers, Amsterdam (1992)
Hájek, P., Sochorová, A., Zvárová, J.: GUHA for personal computers. Computational Statistics and Data Analysis 19, 149–153 (1995)
Tučková, J., Šebesta, V.: The Prosody Optimisation of the Czech Language Synthesizer. In: Novák, M. (ed.) International Journal on Neural and Mass-Parallel Computing and Information Systems “Neural Network World”, ICS AS CR and CTU, FTS, vol. 18(4), pp. 291–308 (2008)

Author information

Authors and Affiliations

Faculty of Electrical Engineering, Dept. of Circuit Theory, Czech Technical University, Technická 2, 166 27, Prague 6, Czech Republic
Jana Tučková
Faculty of Electrical Engineering, Dept. of Measurement, Czech Technical University, Technická 2, 166 27, Prague 6, Czech Republic
Jan Holub
Faculty of Arts and Philosophy, Institute of Translation Studies, Charles University in Prague, Hybernská 3, 110 00, Praha 1
Tomáš Duběda

Editor information

Editors and Affiliations

Department of Psychology, Second University of Naples, and IIASS, Via G. Pellegrino 19, 84019, Vietri sul Mare, (SA), Italy
Anna Esposito
Institute of Photonics and Electronics, Academy of Sciences of the Czech Republic, Chaberská 57, 182 52, Prague 8, Czech Republic
Robert Vích

About this paper

Cite this paper

Tučková, J., Holub, J., Duběda, T. (2009). Technical and Phonetic Aspects of Speech Quality Assessment: The Case of Prosody Synthesis. In: Esposito, A., Vích, R. (eds) Cross-Modal Analysis of Speech, Gestures, Gaze and Facial Expressions. Lecture Notes in Computer Science, vol 5641. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-03320-9_13

DOI: https://doi.org/10.1007/978-3-642-03320-9_13
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-03319-3
Online ISBN: 978-3-642-03320-9
eBook Packages: Computer Science, Computer Science (R0)