Towards Emotion Recognition from Speech: Definition, Problems and the Materials of Research

Anagnostopoulos, Christos-Nikolaos; Iliou, Theodoros

doi:10.1007/978-3-642-11684-1_8


Part of the book series: Studies in Computational Intelligence (SCI, volume 279)


Abstract

One hundred thirty-three (133) sound/speech features extracted from Pitch, Mel Frequency Cepstral Coefficients, Energy and Formants were evaluated in order to create a feature set sufficient to discriminate between seven emotions in acted speech. After appropriate feature selection, Multilayer Perceptrons were trained for emotion recognition on the basis of a 23-input vector, which provides information about the prosody of the speaker over the entire sentence. Several experiments were performed and the results are presented in detail. Extra emphasis was given to assessing the proposed 23-input vector in a speaker-independent framework, where speakers are not "known" to the classifier. The proposed feature vector achieved promising results (51%) for speaker-independent recognition in seven emotion classes. Moreover, for the problem of classifying high and low arousal emotions, the classifier reaches 86.8% successful recognition. A second classification model used a Support Vector Machine with 35 predictive variables. The latter feature vector achieved promising results (78%) for speaker-independent recognition in seven emotion classes. Moreover, for the problem of classifying high and low arousal emotions, this classifier reaches 100% successful recognition for high arousal and 87% for low arousal emotions. Besides the combination of speech processing and artificial intelligence techniques, new approaches incorporating linguistic semantics could play a critical role in helping computers understand human emotions better.
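The speaker-independent setting mentioned in the abstract can be illustrated with a leave-one-speaker-out split, in which every utterance of the held-out speaker is kept out of training. This is only a minimal sketch of one common way to set up such an evaluation; the abstract does not state which protocol the authors used, and the function name, speaker IDs, feature values, and emotion labels below are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterator, List, Sequence, Tuple

# One utterance: (speaker_id, feature_vector, emotion_label).
# The layout and every value below are assumptions made for illustration.
Utterance = Tuple[str, Sequence[float], str]


def leave_one_speaker_out(
    utterances: List[Utterance],
) -> Iterator[Tuple[str, List[Utterance], List[Utterance]]]:
    """Yield (held_out_speaker, train, test) with no speaker in both sets."""
    by_speaker: Dict[str, List[Utterance]] = defaultdict(list)
    for utt in utterances:
        by_speaker[utt[0]].append(utt)

    # Hold out each speaker in turn, in sorted order for reproducibility.
    for held_out in sorted(by_speaker):
        test = by_speaker[held_out]
        train = [
            utt
            for speaker, utts in by_speaker.items()
            if speaker != held_out
            for utt in utts
        ]
        yield held_out, train, test


# Toy data: two-dimensional vectors stand in for the real feature vector.
toy = [
    ("spk01", [0.1, 0.2], "anger"),
    ("spk01", [0.3, 0.1], "sadness"),
    ("spk02", [0.5, 0.4], "anger"),
    ("spk03", [0.2, 0.9], "neutral"),
]

folds = list(leave_one_speaker_out(toy))
for held_out, train, test in folds:
    print(held_out, len(train), len(test))
```

A classifier would be fitted on `train` and scored on `test` in each fold, so the reported accuracy reflects speakers the model has never heard.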



Author information

Authors and Affiliations

Cultural Technology and Communication Department, University of the Aegean, Mytilene, Lesvos Island, GR-81100
Christos-Nikolaos Anagnostopoulos & Theodoros Iliou


Editor information

Editors and Affiliations

University of Peloponnese, Tripolis, Greece
Manolis Wallace
University of the Aegean, Samos, Greece
Ioannis E. Anagnostopoulos
National Technical University of Athens, Athens, Greece
Phivos Mylonas
Slovak University of Technology in Bratislava, Bratislava, Slovakia
Maria Bielikova


About this chapter

Cite this chapter

Anagnostopoulos, CN., Iliou, T. (2010). Towards Emotion Recognition from Speech: Definition, Problems and the Materials of Research. In: Wallace, M., Anagnostopoulos, I.E., Mylonas, P., Bielikova, M. (eds) Semantics in Adaptive and Personalized Services. Studies in Computational Intelligence, vol 279. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-11684-1_8


DOI: https://doi.org/10.1007/978-3-642-11684-1_8
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-11683-4
Online ISBN: 978-3-642-11684-1
eBook Packages: Engineering, Engineering (R0)
