Multi-Modal Fusion Emotion Recognition Based on HMM and ANN

Conference paper
Contemporary Research on E-business Technology and Strategy (iCETS 2012)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 332)

Abstract

Emotional states play an important role in human-computer interaction. This paper proposes an emotion recognition framework that extracts and fuses features from both video sequences and speech signals. The framework employs two Hidden Markov Models (HMMs) to estimate emotional states from the video and audio channels respectively, and an Artificial Neural Network (ANN) serves as the fusion mechanism. Two key stages feed the HMMs: Facial Animation Parameter (FAP) extraction from video sequences based on the Active Appearance Model (AAM), and pitch and energy feature extraction from speech signals. Experiments indicate that the proposed approach performs better and is more robust than methods using video or audio alone.
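As a concrete illustration of the described pipeline, the following sketch shows how two banks of per-emotion HMMs (one per modality) could score a test clip, and how an ANN could fuse the resulting log-likelihood vectors. This is a minimal reconstruction under stated assumptions, not the authors' code: the hmmlearn and scikit-learn libraries, the Gaussian observation model, the number of HMM states, the four-emotion label set, and the stubbed-out feature extraction are all illustrative choices.

```python
# Hedged sketch of the HMM + ANN fusion idea from the abstract -- not the
# authors' implementation. Assumes one Gaussian HMM per emotion and per
# modality (via hmmlearn), log-likelihood scores as intermediate features,
# and a small scikit-learn MLP standing in for the ANN fusion stage.
# AAM-based FAP (video) and pitch/energy (audio) extraction are assumed
# to have already produced per-frame feature arrays.
import numpy as np
from hmmlearn.hmm import GaussianHMM            # pip install hmmlearn
from sklearn.neural_network import MLPClassifier

EMOTIONS = ["anger", "happiness", "sadness", "surprise"]  # illustrative set


def train_hmms(seqs_by_emotion, n_states=4):
    """Train one HMM per emotion on one modality's feature sequences."""
    models = {}
    for emo, seqs in seqs_by_emotion.items():
        X = np.vstack(seqs)                  # stack the (T_i, d) sequences
        lengths = [len(s) for s in seqs]     # sequence boundaries for hmmlearn
        m = GaussianHMM(n_components=n_states, covariance_type="diag", n_iter=50)
        models[emo] = m.fit(X, lengths)
    return models


def loglik_vector(models, seq):
    """Score one observation sequence against every emotion's HMM."""
    return np.array([models[e].score(seq) for e in EMOTIONS])


def fusion_features(video_models, audio_models, video_seq, audio_seq):
    """Concatenate per-modality log-likelihoods as input to the fusion ANN."""
    return np.concatenate([loglik_vector(video_models, video_seq),
                           loglik_vector(audio_models, audio_seq)])


# Fusion ANN: maps the 2 * len(EMOTIONS) log-likelihoods to an emotion label.
# fusion_net = MLPClassifier(hidden_layer_sizes=(16,), max_iter=500)
# fusion_net.fit(np.stack(train_vectors), train_labels)
# predicted = fusion_net.predict(np.stack(test_vectors))
```

The design mirrors the abstract's two-stage structure: each modality's HMM bank performs temporal modeling on its own, and the ANN only sees fixed-length likelihood vectors, which is what allows the late fusion to combine modalities with different frame rates.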

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Xu, C., Cao, T., Feng, Z., Dong, C. (2012). Multi-Modal Fusion Emotion Recognition Based on HMM and ANN. In: Khachidze, V., Wang, T., Siddiqui, S., Liu, V., Cappuccio, S., Lim, A. (eds) Contemporary Research on E-business Technology and Strategy. iCETS 2012. Communications in Computer and Information Science, vol 332. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-34447-3_48

  • DOI: https://doi.org/10.1007/978-3-642-34447-3_48

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-34446-6

  • Online ISBN: 978-3-642-34447-3

  • eBook Packages: Computer Science (R0)
