A Unified Framework for Monocular Video-Based Facial Motion Tracking and Expression Recognition

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10133)


This paper proposes a unified framework for facial motion tracking and expression recognition in monocular video. To retrieve facial motion, an online weight-adaptive statistical appearance method is embedded into a particle filtering strategy; a deformable facial mesh model serves as an intermediate that brings input images into correspondence through registration and deformation. Facial expression is recognized in one of two modes. For fast and efficient applications, facial animation and facial expression are estimated sequentially, and the expression is recognized from static anatomical facial expression knowledge. For robust and precise applications, facial animation and facial expression are estimated simultaneously, and the expression is recognized by fusing static and dynamic facial expression knowledge. Experiments demonstrate high tracking robustness and accuracy, as well as a high facial expression recognition score, for the proposed framework.
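As a rough illustration of how an online appearance model can be embedded in a particle filter, the following is a minimal sketch and not the method of the paper. It assumes a generic bootstrap particle filter with a random-walk motion model, a per-element Gaussian appearance model, and an exponential-forgetting update. The names `particle_filter_step`, `update_appearance`, and `observe` (a stand-in for warping the image through the mesh model at a given state) are hypothetical.

```python
import math
import random


def particle_filter_step(particles, observe, appearance_mean, appearance_var,
                         motion_std=0.05, rng=random):
    """One predict / weight / resample step (illustrative assumption only).

    particles: list of state vectors (lists of floats).
    observe:   hypothetical callable mapping a state to an appearance vector.
    """
    # Predict: diffuse every particle with Gaussian random-walk noise.
    predicted = [[x + rng.gauss(0.0, motion_std) for x in p] for p in particles]

    # Weight: Gaussian log-likelihood of the appearance seen at each particle.
    log_w = []
    for p in predicted:
        patch = observe(p)
        log_w.append(-0.5 * sum((a - m) ** 2 / v for a, m, v in
                                zip(patch, appearance_mean, appearance_var)))

    # Normalise in a numerically stable way (subtract the max log-weight).
    top = max(log_w)
    weights = [math.exp(lw - top) for lw in log_w]
    total = sum(weights)
    weights = [w / total for w in weights]

    # Estimate: weighted mean of the predicted states.
    dim = len(predicted[0])
    estimate = [sum(w * p[d] for w, p in zip(weights, predicted))
                for d in range(dim)]

    # Resample: multinomial resampling proportional to the weights.
    resampled = rng.choices(predicted, weights=weights, k=len(predicted))
    return [list(p) for p in resampled], estimate


def update_appearance(mean, var, patch, alpha=0.1, min_var=1e-4):
    """Online update of the appearance model with forgetting factor alpha."""
    new_mean = [(1 - alpha) * m + alpha * a for m, a in zip(mean, patch)]
    new_var = [max((1 - alpha) * v + alpha * (a - m) ** 2, min_var)
               for m, v, a in zip(mean, var, patch)]
    return new_mean, new_var
```

In a tracking loop, one would call `particle_filter_step` for each frame and then feed the appearance observed at the returned estimate into `update_appearance`, so that the model adapts gradually to lighting and expression changes.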


Facial motion tracking · Facial expression recognition



This work is supported by the National Natural Science Foundation of China (No. 61572450, No. 61303150), the Open Project Program of the State Key Lab of CAD&CG, Zhejiang University (No. A1501), the Fundamental Research Funds for the Central Universities (WK2350000002), the Open Funding Project of State Key Laboratory of Virtual Reality Technology and Systems, Beihang University (No. BUAA-VR-16KF-12), and the Open Funding Project of State Key Laboratory of Novel Software Technology, Nanjing University (No. KFKT2016B08).



Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  1. Department of Automation, University of Science and Technology of China, Hefei, China
