Towards Emotion Recognition from Speech: Definition, Problems and the Materials of Research

  • Chapter
Semantics in Adaptive and Personalized Services

Part of the book series: Studies in Computational Intelligence (SCI, volume 279)

Abstract

One hundred thirty-three (133) sound/speech features extracted from pitch, Mel Frequency Cepstral Coefficients, energy and formants were evaluated in order to create a feature set sufficient to discriminate between seven emotions in acted speech. After feature selection, Multilayer Perceptrons were trained for emotion recognition on a 23-input vector, which provides information about the prosody of the speaker over the entire sentence. Several experiments were performed and the results are reported in detail. Particular emphasis was placed on assessing the proposed 23-input vector in a speaker-independent framework, where the speakers are not “known” to the classifier. The proposed feature vector achieved promising results (51%) for speaker-independent recognition in seven emotion classes. Moreover, for the problem of classifying high and low arousal emotions, this classifier reaches 86.8% successful recognition. The second classification model used a Support Vector Machine with 35 predictive variables. This feature vector achieved promising results (78%) for speaker-independent recognition in seven emotion classes. For the problem of classifying high and low arousal emotions, the second classifier reaches 100% successful recognition for high arousal and 87% for low arousal emotions. Besides the combination of speech processing and artificial intelligence techniques, new approaches incorporating linguistic semantics could play a critical role in helping computers understand human emotions better.
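The speaker-independent protocol and the high/low arousal scoring mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the emotion names, the arousal grouping in `AROUSAL`, and the helper names `leave_one_speaker_out`, `to_arousal` and `per_class_recognition` are all assumptions made for illustration, since the abstract does not list the seven emotions or say how they were grouped.

```python
from collections import defaultdict

# Hypothetical grouping of seven emotion labels into arousal classes.
# The abstract does not specify this mapping; it is an assumption.
AROUSAL = {
    "anger": "high", "fear": "high", "happiness": "high",
    "boredom": "low", "disgust": "low", "neutral": "low", "sadness": "low",
}


def leave_one_speaker_out(samples):
    """Yield (held_out_speaker, train, test) folds.

    Every sample of the held-out speaker goes to the test set, so the
    classifier never sees that speaker during training.
    """
    speakers = sorted({s["speaker"] for s in samples})
    for held_out in speakers:
        train = [s for s in samples if s["speaker"] != held_out]
        test = [s for s in samples if s["speaker"] == held_out]
        yield held_out, train, test


def to_arousal(labels):
    """Collapse emotion labels into the two assumed arousal classes."""
    return [AROUSAL[label] for label in labels]


def per_class_recognition(true_labels, predicted_labels):
    """Return, for each true class, the fraction of samples recognised correctly."""
    total = defaultdict(int)
    correct = defaultdict(int)
    for truth, guess in zip(true_labels, predicted_labels):
        total[truth] += 1
        if truth == guess:
            correct[truth] += 1
    return {label: correct[label] / total[label] for label in total}


if __name__ == "__main__":
    # Toy data: each sample records only its speaker and emotion label.
    samples = [
        {"speaker": "s1", "emotion": "anger"},
        {"speaker": "s1", "emotion": "sadness"},
        {"speaker": "s2", "emotion": "fear"},
        {"speaker": "s3", "emotion": "boredom"},
    ]
    for held_out, train, test in leave_one_speaker_out(samples):
        print(held_out, len(train), len(test))

    truth = to_arousal(["anger", "fear", "sadness", "boredom"])
    guess = to_arousal(["anger", "sadness", "sadness", "neutral"])
    print(per_class_recognition(truth, guess))
```

Reporting recognition per class, rather than as a single accuracy figure, matches the way the abstract gives separate rates for high and low arousal emotions.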

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Anagnostopoulos, CN., Iliou, T. (2010). Towards Emotion Recognition from Speech: Definition, Problems and the Materials of Research. In: Wallace, M., Anagnostopoulos, I.E., Mylonas, P., Bielikova, M. (eds) Semantics in Adaptive and Personalized Services. Studies in Computational Intelligence, vol 279. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-11684-1_8

  • DOI: https://doi.org/10.1007/978-3-642-11684-1_8

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-11683-4

  • Online ISBN: 978-3-642-11684-1

  • eBook Packages: Engineering, Engineering (R0)
