Abstract
Recently, emotion recognition in the wild has attracted increasing attention in computer vision and affective computing. In contrast to classical emotion recognition, emotion recognition in the wild is more challenging because the databases are collected under real-world conditions. Such databases inevitably contain adverse samples whose emotion labels are hard to identify with classical methods designed for databases recorded under ideal conditions, which significantly increases the difficulty of emotion recognition on wild databases. In this paper, we propose a transductive transfer learning framework to handle the problem of emotion recognition in the wild. We develop a sparse transductive transfer linear discriminant analysis (STTLDA) for facial expression recognition and speech emotion recognition under real-world environments. To the best of our knowledge, we are the first to treat emotion recognition in the wild as a transfer learning problem and to use a transductive transfer learning method to eliminate the distribution difference between training and testing samples caused by the "wild" conditions. We conduct extensive experiments on the SFEW 2.0 and AFEW 4.0/5.0 (audio part) databases, which were used in the Emotion Recognition in the Wild Challenges (EmotiW 2014 and 2015), to evaluate the proposed method. Experimental results demonstrate that STTLDA achieves satisfactory performance compared with the baseline provided by the challenge organizers and several competitive methods. In addition, we report our earlier results from the static-image facial expression recognition challenge of EmotiW 2015, where we achieved an accuracy of 50% on the Test set, a 10.87% improvement over the baseline released by the challenge organizers.
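The core transductive idea, using unlabeled test samples during training to absorb the source/target distribution shift, can be illustrated with a minimal numpy sketch. This is a generic self-training LDA loop, not the paper's sparse STTLDA formulation, and all function names and the synthetic data are illustrative assumptions: a classifier is fit on labeled source data, pseudo-labels are assigned to the shifted target data, and the model is refit on both until the pseudo-labels stabilize.

```python
import numpy as np

def fit_lda(X, y):
    """Class means and pooled within-class covariance (with a small ridge)."""
    classes = np.unique(y)
    means = {c: X[y == c].mean(axis=0) for c in classes}
    Sw = sum(np.cov(X[y == c].T, bias=True) * (y == c).sum()
             for c in classes) / len(X)
    Sw += 1e-6 * np.eye(X.shape[1])        # regularize for invertibility
    return classes, means, np.linalg.inv(Sw)

def predict_lda(model, X):
    """Assign each sample to the nearest class mean in Mahalanobis distance."""
    classes, means, Sw_inv = model
    d = np.stack([np.einsum('ij,jk,ik->i', X - means[c], Sw_inv, X - means[c])
                  for c in classes], axis=1)
    return classes[np.argmin(d, axis=1)]

def transductive_lda(Xs, ys, Xt, n_iter=5):
    """Self-training loop: pseudo-label the target set, refit on source+target."""
    model = fit_lda(Xs, ys)
    yt = predict_lda(model, Xt)
    for _ in range(n_iter):
        model = fit_lda(np.vstack([Xs, Xt]), np.concatenate([ys, yt]))
        new = predict_lda(model, Xt)
        if np.array_equal(new, yt):        # pseudo-labels converged
            break
        yt = new
    return yt

# Toy demo: two Gaussian classes; the target domain is shifted by (1, -1).
rng = np.random.default_rng(0)
Xs = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal([4, 4], 1, (50, 2))])
ys = np.array([0] * 50 + [1] * 50)
Xt = np.vstack([rng.normal(0, 1, (40, 2)),
                rng.normal([4, 4], 1, (40, 2))]) + np.array([1.0, -1.0])
yt_true = np.array([0] * 40 + [1] * 40)

pred = transductive_lda(Xs, ys, Xt)
acc = (pred == yt_true).mean()
```

The sparse formulation in the paper additionally constrains the projection, but the loop above captures why transduction helps: the refit model's class statistics are estimated partly from the (pseudo-labeled) wild test samples themselves.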
Acknowledgments
The authors would like to thank anonymous reviewers for their useful comments and valuable suggestions.
Additional information
This work was partly supported by the National Basic Research Program of China under Grants 2015CB351704 and 2011CB302202, the National Natural Science Foundation of China (NSFC) under Grants 61231002 and 61201444, the Ph.D. Program Foundation of the Ministry of Education of China under Grant 20120092110054, the Natural Science Foundation of Jiangsu Province under Grant BK20130020, and the Graduate Research Innovation Project of Jiangsu Province under Grant KYZZ15_0055.
Cite this article
Zong, Y., Zheng, W., Huang, X. et al. Emotion recognition in the wild via sparse transductive transfer linear discriminant analysis. J Multimodal User Interfaces 10, 163–172 (2016). https://doi.org/10.1007/s12193-015-0210-7