Objective and Subjective Evaluation of an Expressive Speech Corpus

Iriondo, Ignasi; Planet, Santiago; Socoró, Joan-Claudi; Alías, Francesc

doi:10.1007/978-3-540-77347-4_5

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4885)

Included in the following conference series: International Conference on Nonlinear Speech Processing

Abstract

This paper validates the expressiveness of an acted speech corpus recorded for use in speech synthesis. First, an objective validation is carried out with automatic emotion identification techniques that use statistical features extracted from the prosodic parameters of speech. Second, a listening test is conducted on a subset of the utterances. The relationship between the objective and subjective evaluations is then analyzed, and the resulting conclusions can help improve the subsequent steps of expressive speech synthesis.

This work has been partially supported by the European Commission, project SALERO FP6 IST-4-027122-IP.

Author information

Authors and Affiliations

GPMM - Grup de Recerca en Processament Multimodal, Enginyeria i Arquitectura La Salle, Universitat Ramon Llull, C/ Quatre Camins 2, 08022 Barcelona, Spain
Ignasi Iriondo, Santiago Planet, Joan-Claudi Socoró & Francesc Alías

Editor information

Mohamed Chetouani, Amir Hussain, Bruno Gas, Maurice Milgram, Jean-Luc Zarader

Cite this paper

Iriondo, I., Planet, S., Socoró, JC., Alías, F. (2007). Objective and Subjective Evaluation of an Expressive Speech Corpus. In: Chetouani, M., Hussain, A., Gas, B., Milgram, M., Zarader, JL. (eds) Advances in Nonlinear Speech Processing. NOLISP 2007. Lecture Notes in Computer Science, vol 4885. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-77347-4_5

DOI: https://doi.org/10.1007/978-3-540-77347-4_5
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-77346-7
Online ISBN: 978-3-540-77347-4
eBook Packages: Computer Science (R0)
