Objective and Subjective Evaluation of an Expressive Speech Corpus

  • Conference paper
Advances in Nonlinear Speech Processing (NOLISP 2007)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4885)

Abstract

This paper presents the validation of the expressiveness of an acted oral corpus produced for use in speech synthesis. First, an objective validation was conducted by means of automatic emotion identification techniques using statistical features extracted from the prosodic parameters of speech. Second, a listening test was performed with a subset of utterances. The relationship between the objective and subjective evaluations is analyzed, and the conclusions obtained can help improve the subsequent steps of expressive speech synthesis.
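As a rough illustration of what "statistical features extracted from the prosodic parameters" could look like, the sketch below summarizes two prosodic contours of one utterance into a flat feature vector. The choice of contours (F0 and energy), the particular statistics, and the names `contour_stats` and `utterance_features` are assumptions made for illustration; the abstract does not specify the feature set or the classifier used.

```python
from statistics import mean, pstdev


def contour_stats(values):
    """Summary statistics of one prosodic contour (e.g. F0 per voiced frame).

    Hypothetical helper: the statistics chosen here are an assumption,
    not the feature set used in the paper.
    """
    if not values:
        raise ValueError("contour must contain at least one value")
    return {
        "mean": mean(values),
        "std": pstdev(values),  # population standard deviation
        "min": min(values),
        "max": max(values),
        "range": max(values) - min(values),
    }


def utterance_features(f0, energy):
    """Flatten per-contour statistics into one feature dict per utterance."""
    feats = {}
    for name, contour in (("f0", f0), ("energy", energy)):
        for stat, value in contour_stats(contour).items():
            feats[f"{name}_{stat}"] = value
    return feats


# Toy contours standing in for values measured on a real utterance.
features = utterance_features(f0=[110.0, 120.0, 130.0], energy=[0.2, 0.4, 0.6])
```

A vector like `features`, computed for each utterance and paired with its intended emotion label, is the kind of input an automatic emotion identification method could be trained and evaluated on.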

This work has been partially supported by the European Commission, project SALERO FP6 IST-4-027122-IP.


Editor information

Mohamed Chetouani Amir Hussain Bruno Gas Maurice Milgram Jean-Luc Zarader

Copyright information

© 2007 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Iriondo, I., Planet, S., Socoró, JC., Alías, F. (2007). Objective and Subjective Evaluation of an Expressive Speech Corpus. In: Chetouani, M., Hussain, A., Gas, B., Milgram, M., Zarader, JL. (eds) Advances in Nonlinear Speech Processing. NOLISP 2007. Lecture Notes in Computer Science, vol 4885. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-77347-4_5

  • DOI: https://doi.org/10.1007/978-3-540-77347-4_5

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-77346-7

  • Online ISBN: 978-3-540-77347-4

  • eBook Packages: Computer Science, Computer Science (R0)
