Photorealistic Facial Synthesis in the Dimensional Affect Space

  • Dimitrios Kollias
  • Shiyang Cheng
  • Maja Pantic
  • Stefanos Zafeiriou
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11130)


This paper presents a novel approach for synthesizing facial affect, based on our annotation of 600,000 frames of the 4DFAB database in terms of valence and arousal. The input to the approach is a pair of these emotional state descriptors together with a neutral 2D image of the person on whom the corresponding affect will be synthesized. Given this target pair, a set of 3D facial meshes is selected and used to build a blendshape model that generates the new facial affect. To synthesize the affect on the 2D neutral image, 3DMM fitting is performed and the reconstructed face is deformed to produce the target facial expression. Finally, the new face is rendered back into the original image. Both qualitative and quantitative experimental studies illustrate the generation of realistic images when the neutral image is sampled from a variety of well-known databases, such as Aff-Wild, AFEW, Multi-PIE, AFEW-VA, BU-3DFE, and Bosphorus.
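The core of the approach above is a blendshape model: a new expression mesh is obtained as the neutral (mean) mesh plus a weighted combination of deformation components, where the weights are driven by the target (valence, arousal) pair. The following is a minimal, self-contained numpy sketch of that blendshape step only; the mesh data, basis, and function names here are synthetic and illustrative, not the authors' actual model or data.

```python
import numpy as np

# Toy blendshape model: a facial mesh is a flattened vector of 3D vertex
# coordinates. An expression mesh is synthesized as
#     m = mean_mesh + basis @ weights
# where the columns of `basis` are localized deformation components.

rng = np.random.default_rng(0)
n_vertices = 5                 # tiny synthetic mesh (real models use thousands)
dim = 3 * n_vertices

mean_mesh = rng.standard_normal(dim)       # stand-in for the neutral face
basis = rng.standard_normal((dim, 4))      # stand-in for 4 blendshape components

def synthesize(weights):
    """Deform the neutral mesh by a weighted sum of blendshape components."""
    return mean_mesh + basis @ np.asarray(weights, dtype=float)

neutral = synthesize([0, 0, 0, 0])   # zero weights recover the neutral mesh
expressive = synthesize([0.8, 0.0, 0.2, 0.0])
print(np.allclose(neutral, mean_mesh))  # True
```

In the paper's pipeline, the weights are not chosen by hand: the target valence-arousal pair selects 4DFAB meshes from which the model is built, and the resulting deformation is transferred to the 3DMM-reconstructed face before rendering.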


Keywords: Dimensional facial affect synthesis · Valence · Arousal · Discretization · Blendshape models · 3DMM fitting · 4DFAB · Aff-Wild · AFEW · AFEW-VA · Multi-PIE · BU-3DFE · Bosphorus · Deep neural networks



The authors would also like to thank Evangelos Ververas for assisting with the affect synthesis. The work of Dimitrios Kollias was funded by a Teaching Fellowship of Imperial College London. The work of Shiyang Cheng was funded by the EPSRC projects EP/J017787/1 (4D-FAB) and EP/N007743/1 (FACER2VM).



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Dimitrios Kollias (1)
  • Shiyang Cheng (1)
  • Maja Pantic (1)
  • Stefanos Zafeiriou (1, 2)

  1. Department of Computing, Imperial College London, London, UK
  2. Centre for Machine Vision and Signal Analysis, University of Oulu, Oulu, Finland
