Abstract
Micro-expressions are spontaneous and uncontrollable conveyances of emotion. They carry abundant psychological information, and their recognition is significant in many fields. In recent years, with the rapid development of computer vision, research on facial expression recognition has matured, while micro-expression recognition remains a hot yet challenging topic. The main difficulty lies in extracting discriminative features, because micro-expressions are extremely short-lived and subtle. To address this problem, this paper proposes a deep learning model that efficiently extracts discriminative features. The model consists of three VGGNets and one Long Short-Term Memory (LSTM) network. The three VGGNets extract static and motion information, with three types of attention mechanism jointly integrated to produce more discriminative visual representations. The spatial features of a micro-expression sequence are then fed sequentially into the LSTM, which extracts spatio-temporal features and predicts the micro-expression category. The algorithm is evaluated on the benchmark micro-expression dataset CASME II, and its effectiveness is demonstrated by extensive ablation analysis and comparisons with state-of-the-art algorithms.
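The pipeline described above (per-frame spatial features reweighted by an attention mechanism, then aggregated over time by an LSTM that emits a class prediction) can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the paper's implementation: the feature shapes, the single-vector attention scoring, the random weights, and the 5-way classifier head are all placeholders for demonstration.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def spatial_attention(feat, w):
    # feat: (H*W, C) flattened spatial feature map; w: (C,) scoring vector.
    # One softmax weight per spatial location, then a weighted sum of locations.
    scores = softmax(feat @ w)      # (H*W,)
    return scores @ feat            # (C,) attended frame descriptor

def lstm_step(x, h, c, W, U, b):
    # Standard LSTM cell; gates packed row-wise as [input, forget, output, cell].
    z = W @ x + U @ h + b
    H = h.size
    i, f, o = (1.0 / (1.0 + np.exp(-z[k * H:(k + 1) * H])) for k in range(3))
    g = np.tanh(z[3 * H:])
    c = f * c + i * g
    h = o * np.tanh(c)
    return h, c

rng = np.random.default_rng(0)
T, HW, C, Hdim = 8, 49, 32, 16      # sequence length, spatial sites, channels, hidden size
frames = rng.standard_normal((T, HW, C))   # stand-in for per-frame CNN features
w_att = rng.standard_normal(C)
W = rng.standard_normal((4 * Hdim, C)) * 0.1
U = rng.standard_normal((4 * Hdim, Hdim)) * 0.1
b = np.zeros(4 * Hdim)

h, c = np.zeros(Hdim), np.zeros(Hdim)
for t in range(T):
    x = spatial_attention(frames[t], w_att)    # attended feature for frame t
    h, c = lstm_step(x, h, c, W, U, b)

logits = h[:5]                      # pretend 5-way classification head
pred = int(np.argmax(logits))
print(pred)
```

A trained system would replace the random frame tensors with VGGNet activations and learn all weights end-to-end; the sketch only shows how attention-weighted spatial features flow through the recurrent step to a category prediction.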
Acknowledgments
This work was supported in part by the China Postdoctoral Science Foundation under Grant 2018M630190 and in part by the Beijing NOVA Program under Grant Z181100006218041.
Yang, B., Cheng, J., Yang, Y. et al. MERTA: micro-expression recognition with ternary attentions. Multimed Tools Appl 80, 1–16 (2021). https://doi.org/10.1007/s11042-019-07896-4