3D Convolutional Neural Networks for Facial Expression Classification

  • Wenyun Sun
  • Haitao Zhao
  • Zhong Jin
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10116)


In this paper, general rules for designing 3D Convolutional Neural Networks are discussed. Four specific networks are designed for the facial expression classification problem, and their decisions are fused together. The single networks and the ensemble network are evaluated on the extended Cohn-Kanade dataset and achieve accuracies of 92.31% and 96.15%, respectively. These results outperform the state of the art. A reusable open-source project called 4DCNN is released; building on it makes implementing 3D Convolutional Neural Networks for specific problems convenient.
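The two ideas named in the abstract, 3D convolution over a frame sequence and fusion of several networks' decisions, can be illustrated with a minimal sketch. The abstract does not state the layer configuration or the fusion rule, so everything here is an assumption for illustration: the function names `conv3d_valid` and `fuse_decisions` are hypothetical, the fusion is a plain average of class probabilities, and none of this is the 4DCNN project's API.

```python
from typing import List, Sequence

Volume = List[List[List[float]]]  # indexed as [frame][row][col]


def conv3d_valid(volume: Volume, kernel: Volume) -> Volume:
    """Naive 'valid' 3D cross-correlation of one volume with one kernel."""
    T, H, W = len(volume), len(volume[0]), len(volume[0][0])
    kt, kh, kw = len(kernel), len(kernel[0]), len(kernel[0][0])
    out = []
    for t in range(T - kt + 1):
        plane = []
        for y in range(H - kh + 1):
            row = []
            for x in range(W - kw + 1):
                acc = 0.0
                # Sum over the temporal and both spatial kernel axes.
                for dt in range(kt):
                    for dy in range(kh):
                        for dx in range(kw):
                            acc += volume[t + dt][y + dy][x + dx] * kernel[dt][dy][dx]
                row.append(acc)
            plane.append(row)
        out.append(plane)
    return out


def fuse_decisions(probabilities: Sequence[Sequence[float]]) -> int:
    """Average per-network class probabilities (assumed rule) and return the argmax."""
    n_nets = len(probabilities)
    n_classes = len(probabilities[0])
    mean = [sum(p[c] for p in probabilities) / n_nets for c in range(n_classes)]
    return max(range(n_classes), key=lambda c: mean[c])


# Toy clip: 3 frames of 4x4 pixels, convolved with a 2x2x2 kernel of ones.
clip = [[[1.0] * 4 for _ in range(4)] for _ in range(3)]
kernel = [[[1.0] * 2 for _ in range(2)] for _ in range(2)]
features = conv3d_valid(clip, kernel)  # 2 x 3 x 3 output volume

# Made-up outputs of four networks over three classes.
outputs = [
    [0.6, 0.3, 0.1],
    [0.2, 0.5, 0.3],
    [0.1, 0.7, 0.2],
    [0.5, 0.4, 0.1],
]
fused_class = fuse_decisions(outputs)
```

In this toy example two of the four networks individually prefer class 0, yet the averaged probabilities favour class 1, which is the kind of disagreement an ensemble is meant to resolve.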


Keywords (machine-generated, not supplied by the authors): Convolutional Neural Network; Facial Expression Recognition; Human Action Recognition; Fiducial Point; Facial Action Code System



Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  1. School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing, China
  2. School of Information Science and Engineering, East China University of Science and Technology, Shanghai, China
