Multimodal Fusion Using Kernel-Based ELM for Video Emotion Recognition

  • Conference paper

Part of the book series: Proceedings in Adaptation, Learning and Optimization (PALO, volume 6)

Abstract

This paper presents a multimodal fusion approach to video emotion recognition that uses a kernel-based Extreme Learning Machine (ELM) to combine video content with electroencephalogram (EEG) signals. First, several audio and visual features are extracted from the video clips, and EEG features are obtained using Wavelet Packet Decomposition (WPD). Second, the video features are selected using Double Input Symmetrical Relevance (DISR), and the EEG features are selected using a Decision Tree (DT). Third, multimodal fusion using kernel-based ELM is adopted for classification, combining the video and EEG features at the decision level. To test the validity of the proposed method, we designed and conducted an EEG experiment to collect data consisting of video clips and the EEG signals of subjects. We compare the classification accuracy of our method with that of two single-modality methods, one using video content only and one using EEG signals only. The experimental results show that the proposed fusion method achieves better classification performance than video emotion recognition methods that use either video content or EEG signals alone.
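The classification and fusion step described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not state the kernel, the hyperparameters, or the rule used to combine the two classifiers' decisions. The RBF kernel, the values of `C` and `gamma`, the equal-weight sum in the hypothetical `fuse_decisions` helper, and the synthetic feature matrices are all assumptions. The only part taken from the ELM literature is the standard kernel ELM closed form, f(x) = K(x, X)(I/C + Ω)^(-1) T, where Ω is the kernel matrix of the training data and T holds the class targets.

```python
import numpy as np


def rbf_kernel(A, B, gamma):
    """Gaussian (RBF) kernel matrix between the rows of A and the rows of B."""
    sq_dists = (
        np.sum(A ** 2, axis=1)[:, None]
        + np.sum(B ** 2, axis=1)[None, :]
        - 2.0 * A @ B.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-gamma * np.maximum(sq_dists, 0.0))


class KernelELM:
    """Kernel ELM classifier: f(x) = K(x, X) (I/C + K(X, X))^{-1} T."""

    def __init__(self, C=10.0, gamma=1.0):
        self.C = C          # regularisation constant (assumed value)
        self.gamma = gamma  # RBF width parameter (assumed value)

    def fit(self, X, y):
        self.X_train = np.asarray(X, dtype=float)
        y = np.asarray(y)
        self.classes_ = np.unique(y)
        # Targets coded as +1 for the true class and -1 otherwise.
        T = np.where(y[:, None] == self.classes_[None, :], 1.0, -1.0)
        omega = rbf_kernel(self.X_train, self.X_train, self.gamma)
        n = omega.shape[0]
        # Solve (I/C + Omega) beta = T instead of forming the inverse.
        self.beta = np.linalg.solve(np.eye(n) / self.C + omega, T)
        return self

    def decision_function(self, X):
        K = rbf_kernel(np.asarray(X, dtype=float), self.X_train, self.gamma)
        return K @ self.beta  # one score column per class


def fuse_decisions(scores_video, scores_eeg, w_video=0.5):
    """Assumed fusion rule: weighted sum of per-class scores, then argmax."""
    fused = w_video * scores_video + (1.0 - w_video) * scores_eeg
    return np.argmax(fused, axis=1)


# Synthetic stand-ins for already-selected video and EEG feature vectors.
rng = np.random.default_rng(0)
labels = np.repeat([0, 1], 40)
video_feats = rng.normal(size=(80, 6)) + 3.0 * labels[:, None]
eeg_feats = rng.normal(size=(80, 4)) + 3.0 * labels[:, None]

# Alternate samples between the training and test splits.
train = np.arange(80) % 2 == 0
test = ~train

# One kernel ELM per modality, trained independently.
video_elm = KernelELM(C=10.0, gamma=0.1).fit(video_feats[train], labels[train])
eeg_elm = KernelELM(C=10.0, gamma=0.1).fit(eeg_feats[train], labels[train])

# Decision-level fusion: combine the two models' class scores.
fused_idx = fuse_decisions(
    video_elm.decision_function(video_feats[test]),
    eeg_elm.decision_function(eeg_feats[test]),
)
predictions = video_elm.classes_[fused_idx]
accuracy = float(np.mean(predictions == labels[test]))
```

Training one classifier per modality and merging only their output scores is what distinguishes decision-level fusion from feature-level fusion, where the video and EEG vectors would be concatenated before a single classifier is trained. The weight `w_video` is a hypothetical knob; in practice it would have to be tuned on held-out data.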

Acknowledgements

This research is partially sponsored by the Natural Science Foundation of China (Nos. 61175115, 61370113 and 61272320), the Beijing Municipal Natural Science Foundation (4152005 and 4152006), the Importation and Development of High-Caliber Talents Project of Beijing Municipal Institutions (CIT&TCD201304035), the Jing-Hua Talents Project of Beijing University of Technology (2014-JH-L06), the Ri-Xin Talents Project of Beijing University of Technology (2014-RX-L06), the Research Fund of Beijing Municipal Commission of Education (PXM2015_014204_500221) and the International Communication Ability Development Plan for Young Teachers of Beijing University of Technology (No. 2014-16).

Author information

Corresponding author

Correspondence to Zhen Yang.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Duan, L., Ge, H., Yang, Z., Chen, J. (2016). Multimodal Fusion Using Kernel-Based ELM for Video Emotion Recognition. In: Cao, J., Mao, K., Wu, J., Lendasse, A. (eds) Proceedings of ELM-2015 Volume 1. Proceedings in Adaptation, Learning and Optimization, vol 6. Springer, Cham. https://doi.org/10.1007/978-3-319-28397-5_29

  • DOI: https://doi.org/10.1007/978-3-319-28397-5_29

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-28396-8

  • Online ISBN: 978-3-319-28397-5

  • eBook Packages: Engineering, Engineering (R0)
