Multi-Modal Fusion Emotion Recognition Based on HMM and ANN

Conference paper
Contemporary Research on E-business Technology and Strategy (iCETS 2012)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 332)

Abstract

Emotional states play an important role in human-computer interaction. This paper proposes an emotion recognition framework that extracts and fuses features from both video sequences and speech signals. The framework employs two Hidden Markov Models (HMMs) to estimate emotional states from the video and audio channels respectively, and an Artificial Neural Network (ANN) serves as the fusion mechanism. Two key stages feed the HMMs: Facial Animation Parameter (FAP) extraction from video sequences based on the Active Appearance Model (AAM), and pitch and energy feature extraction from speech signals. Experiments indicate that the proposed approach performs better and is more robust than methods using video or audio alone.
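As a concrete illustration of the described pipeline, the following sketch shows how two banks of per-emotion HMMs (one per modality) could score a test clip, and how an ANN could fuse the resulting log-likelihood vectors. This is a minimal reconstruction under stated assumptions, not the authors' code: the hmmlearn and scikit-learn libraries, the Gaussian observation model, the number of HMM states, the four-emotion label set, and the stubbed-out feature extraction are all illustrative choices.

```python
# Hedged sketch of the HMM + ANN fusion idea from the abstract -- not the
# authors' implementation. Assumes one Gaussian HMM per emotion and per
# modality (via hmmlearn), log-likelihood scores as intermediate features,
# and a small scikit-learn MLP standing in for the ANN fusion stage.
# AAM-based FAP (video) and pitch/energy (audio) extraction are assumed
# to have already produced per-frame feature arrays.
import numpy as np
from hmmlearn.hmm import GaussianHMM            # pip install hmmlearn
from sklearn.neural_network import MLPClassifier

EMOTIONS = ["anger", "happiness", "sadness", "surprise"]  # illustrative set


def train_hmms(seqs_by_emotion, n_states=4):
    """Train one HMM per emotion on one modality's feature sequences."""
    models = {}
    for emo, seqs in seqs_by_emotion.items():
        X = np.vstack(seqs)                  # stack the (T_i, d) sequences
        lengths = [len(s) for s in seqs]     # sequence boundaries for hmmlearn
        m = GaussianHMM(n_components=n_states, covariance_type="diag", n_iter=50)
        models[emo] = m.fit(X, lengths)
    return models


def loglik_vector(models, seq):
    """Score one observation sequence against every emotion's HMM."""
    return np.array([models[e].score(seq) for e in EMOTIONS])


def fusion_features(video_models, audio_models, video_seq, audio_seq):
    """Concatenate per-modality log-likelihoods as input to the fusion ANN."""
    return np.concatenate([loglik_vector(video_models, video_seq),
                           loglik_vector(audio_models, audio_seq)])


# Fusion ANN: maps the 2 * len(EMOTIONS) log-likelihoods to an emotion label.
# fusion_net = MLPClassifier(hidden_layer_sizes=(16,), max_iter=500)
# fusion_net.fit(np.stack(train_vectors), train_labels)
# predicted = fusion_net.predict(np.stack(test_vectors))
```

The design mirrors the abstract's two-stage structure: each modality's HMM bank performs temporal modeling on its own, and the ANN only sees fixed-length likelihood vectors, which is what allows the late fusion to combine modalities with different frame rates.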

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Xu, C., Cao, T., Feng, Z., Dong, C. (2012). Multi-Modal Fusion Emotion Recognition Based on HMM and ANN. In: Khachidze, V., Wang, T., Siddiqui, S., Liu, V., Cappuccio, S., Lim, A. (eds) Contemporary Research on E-business Technology and Strategy. iCETS 2012. Communications in Computer and Information Science, vol 332. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-34447-3_48

  • DOI: https://doi.org/10.1007/978-3-642-34447-3_48

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-34446-6

  • Online ISBN: 978-3-642-34447-3

  • eBook Packages: Computer Science (R0)
