
MERTA: micro-expression recognition with ternary attentions

Published in Multimedia Tools and Applications

Abstract

Micro-expressions are spontaneous, uncontrollable facial movements that convey emotions. They carry rich psychological information, and their recognition is of significant importance in many fields. In recent years, with the rapid development of computer vision, research on facial expression recognition has matured, while micro-expression recognition remains an active yet challenging topic. The main difficulty lies in extracting discriminative features, owing to the extremely short duration and subtlety of micro-expressions. To address this problem, this paper proposes a deep learning model that efficiently extracts discriminative features. Our model is built on three VGGNets and one Long Short-Term Memory (LSTM) network. The three VGGNets extract static and motion information, with three types of attention mechanism jointly integrated to obtain more discriminative visual representations. The spatial features of a micro-expression sequence are then fed sequentially into the LSTM to extract spatio-temporal features and predict the micro-expression category. Our algorithm is evaluated on the benchmark micro-expression dataset CASME II, and its effectiveness is demonstrated by extensive ablation analysis and comparisons with state-of-the-art algorithms.
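The pipeline described above (per-frame CNN features, attention-weighted pooling, then an LSTM over the frame sequence for classification) can be sketched as follows. This is a minimal illustrative toy, not the paper's implementation: the backbone is replaced by random stand-in feature maps, only a single softmax spatial attention is shown (the paper integrates three attention types), and all dimensions and weights are assumed for demonstration.

```python
import numpy as np

# Toy sketch of an attention + LSTM recognition pipeline:
# per-frame CNN features -> attention-weighted pooling -> LSTM -> classifier.
# All weights are random stand-ins; C, H, W_sp, d, n_classes, T are assumptions.

rng = np.random.default_rng(0)
C, H, W_sp, d, n_classes, T = 8, 4, 4, 6, 5, 10  # channels, spatial dims, hidden, classes, frames

def spatial_attention(feat, w):
    """Softmax attention over spatial locations of a (C, H, W) feature map."""
    scores = np.tensordot(w, feat, axes=([0], [0]))       # (H, W) relevance scores
    alpha = np.exp(scores - scores.max())
    alpha /= alpha.sum()                                   # attention weights sum to 1
    pooled = (feat * alpha).reshape(feat.shape[0], -1).sum(axis=1)  # (C,) weighted pool
    return pooled, alpha

def lstm_step(x, h, c, Wz, Uz, b):
    """One LSTM step; gates packed as [input, forget, output, candidate]."""
    z = Wz @ x + Uz @ h + b
    i, f, o = (1.0 / (1.0 + np.exp(-z[k * d:(k + 1) * d])) for k in range(3))
    g = np.tanh(z[3 * d:])
    c = f * c + i * g
    h = o * np.tanh(c)
    return h, c

# Random stand-in parameters (a real model learns these end to end).
w_att = rng.normal(size=C) / np.sqrt(C)
Wz = rng.normal(size=(4 * d, C)) * 0.1
Uz = rng.normal(size=(4 * d, d)) * 0.1
b = np.zeros(4 * d)
W_cls = rng.normal(size=(n_classes, d)) * 0.1

h, c = np.zeros(d), np.zeros(d)
for _ in range(T):                                         # iterate over the frame sequence
    feat = rng.normal(size=(C, H, W_sp))                   # stand-in for a VGG feature map
    pooled, alpha = spatial_attention(feat, w_att)
    h, c = lstm_step(pooled, h, c, Wz, Uz, b)

logits = W_cls @ h                                         # per-class scores for the sequence
print(logits.shape)
```

The key design point the sketch mirrors is that attention re-weights spatial locations before pooling, so subtle, localized facial motion contributes more to the per-frame vector the LSTM consumes than uninformative background regions.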




Acknowledgments

This work was supported in part by the China Postdoctoral Science Foundation under Grant 2018M630190 and in part by the Beijing NOVA Program (Z181100006218041).

Author information

Corresponding author

Correspondence to Bing Yang.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Yang, B., Cheng, J., Yang, Y. et al. MERTA: micro-expression recognition with ternary attentions. Multimed Tools Appl 80, 1–16 (2021). https://doi.org/10.1007/s11042-019-07896-4

