Abstract
Emotional states play an important role in human-computer interaction. We propose an emotion recognition framework that extracts and fuses features from both video sequences and speech signals. The framework comprises two Hidden Markov Models (HMMs) that recognize emotional states from video and audio respectively; an Artificial Neural Network (ANN) then fuses their outputs. The two key stages feeding the HMMs are the extraction of Facial Animation Parameters (FAPs) from video sequences using an Active Appearance Model (AAM), and the extraction of pitch and energy features from speech signals. Experiments indicate that the proposed approach achieves better accuracy and robustness than methods using video or audio alone.
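The abstract describes a late-fusion architecture: each modality's HMM scores the candidate emotions, and an ANN combines the two score vectors into a final decision. The paper itself is not available beyond the abstract, so the following is only a minimal illustrative sketch of that decision-level fusion pattern; the emotion labels, the fixed fusion weights, and the toy per-class log-likelihoods are all hypothetical, and the trained ANN is stood in for by a single weighted-sum-plus-softmax layer.

```python
import math

# Hypothetical emotion classes (not specified in the abstract).
EMOTIONS = ["happy", "sad", "angry", "neutral"]

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def fuse(video_loglik, audio_loglik, w_video=0.6, w_audio=0.4):
    """Decision-level fusion sketch: a weighted sum of the per-class
    log-likelihoods from the two modality HMMs, passed through a softmax.
    In the paper this combination is learned by an ANN; the fixed weights
    here are purely illustrative."""
    combined = [w_video * v + w_audio * a
                for v, a in zip(video_loglik, audio_loglik)]
    return softmax(combined)

# Toy log-likelihoods that a video HMM (over FAP sequences) and an audio
# HMM (over pitch/energy features) might emit for one utterance.
video_ll = [-1.0, -3.0, -2.5, -2.0]   # video evidence favors "happy"
audio_ll = [-2.0, -1.5, -3.0, -2.5]   # audio evidence leans toward "sad"

probs = fuse(video_ll, audio_ll)
best = EMOTIONS[probs.index(max(probs))]
print(best, [round(p, 3) for p in probs])
```

In this toy run the fused decision follows the stronger video evidence, which is the intended benefit of combining modalities: one modality can compensate when the other is ambiguous.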
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
Cite this paper
Xu, C., Cao, T., Feng, Z., Dong, C. (2012). Multi-Modal Fusion Emotion Recognition Based on HMM and ANN. In: Khachidze, V., Wang, T., Siddiqui, S., Liu, V., Cappuccio, S., Lim, A. (eds) Contemporary Research on E-business Technology and Strategy. iCETS 2012. Communications in Computer and Information Science, vol 332. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-34447-3_48
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-34446-6
Online ISBN: 978-3-642-34447-3