Abstract
The automatic classification of users' internal affective and emotional states is nowadays relevant for many applications, ranging from organisational tasks to health care. To develop suitable automatic technical systems, training material is necessary for an appropriate adaptation towards users. In this paper, we present a framework that reduces the manual effort of annotating emotional states. In particular, it pre-selects video material containing facial expressions for detailed coding according to the Facial Action Coding System, based on audio features, namely prosodic and mel-frequency cepstral features. Further, we present the results of first experiments, conducted as a proof of concept and to determine the parameters of the classifier, which is based on Hidden Markov Models. The experiments were performed on the EmoRec I dataset.
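The pre-selection idea can be sketched as follows: each audio segment is scored against competing HMMs, and only segments that score higher under an "emotional" model than under a "neutral" model are forwarded for manual FACS coding. This is a minimal illustrative sketch, not the paper's implementation: it uses discrete (vector-quantized) observations and toy hand-set parameters, whereas the paper's classifier operates on continuous prosodic and mel-frequency features; the model names, parameters, and decision rule below are assumptions.

```python
import numpy as np

def log_forward(obs, log_pi, log_A, log_B):
    """Log-likelihood of a discrete observation sequence under an HMM,
    computed with the forward algorithm in the log domain for stability."""
    alpha = log_pi + log_B[:, obs[0]]                       # (n_states,)
    for o in obs[1:]:
        # log-sum-exp over previous states, then add the emission term
        alpha = np.logaddexp.reduce(alpha[:, None] + log_A, axis=0) + log_B[:, o]
    return np.logaddexp.reduce(alpha)

def preselect(segments, models):
    """Keep only segments whose audio features are more likely under the
    'emotional' HMM than under the 'neutral' HMM (hypothetical rule)."""
    keep = []
    for seg_id, obs in segments:
        scores = {name: log_forward(obs, *params) for name, params in models.items()}
        if scores["emotional"] > scores["neutral"]:
            keep.append(seg_id)
    return keep

# Toy 2-state, 2-symbol models: "neutral" favours symbol 0, "emotional" symbol 1.
lg = lambda p: np.log(np.asarray(p, dtype=float))
models = {
    "neutral":   (lg([0.5, 0.5]), lg([[0.9, 0.1], [0.1, 0.9]]),
                  lg([[0.9, 0.1], [0.8, 0.2]])),
    "emotional": (lg([0.5, 0.5]), lg([[0.9, 0.1], [0.1, 0.9]]),
                  lg([[0.1, 0.9], [0.2, 0.8]])),
}
segments = [("seg-01", [1, 1, 1, 1]), ("seg-02", [0, 0, 0, 0])]
selected = preselect(segments, models)  # only seg-01 scores as "emotional"
```

In practice the observation symbols would come from quantised MFCC/prosodic feature vectors, and the HMM parameters would be trained (e.g. with Baum-Welch) rather than set by hand.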
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
Cite this paper
Böck, R., Limbrecht-Ecklundt, K., Siegert, I., Walter, S., Wendemuth, A. (2013). Audio-Based Pre-classification for Semi-automatic Facial Expression Coding. In: Kurosu, M. (eds) Human-Computer Interaction. Towards Intelligent and Implicit Interaction. HCI 2013. Lecture Notes in Computer Science, vol 8008. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-39342-6_33
DOI: https://doi.org/10.1007/978-3-642-39342-6_33
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-39341-9
Online ISBN: 978-3-642-39342-6