Joint 3D Face Reconstruction and Dense Alignment with Position Map Regression Network

  • Yao Feng
  • Fan Wu
  • Xiaohu Shao
  • Yanfeng Wang
  • Xi Zhou
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11218)

Abstract

We propose a straightforward method that simultaneously reconstructs the 3D facial structure and provides dense alignment. To achieve this, we design a 2D representation called the UV position map, which records the 3D shape of a complete face in UV space, and then train a simple Convolutional Neural Network to regress it from a single 2D image. We also integrate a weight mask into the loss function during training to improve the performance of the network. Our method does not rely on any prior face model and can reconstruct full facial geometry along with semantic meaning. Moreover, our network is very lightweight and takes only 9.8 ms to process an image, which is much faster than previous works. Experiments on multiple challenging datasets show that our method surpasses other state-of-the-art methods on both reconstruction and alignment tasks by a large margin. Code is available at https://github.com/YadiraF/PRNet.
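
The two ideas in the abstract, a UV position map and a weight mask in the loss, can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the tiny 4x4 map size, and the mask values are hypothetical choices made only for illustration. The sketch assumes the position map is an array of shape (H, W, 3) whose pixels store (x, y, z) coordinates, and that the weight mask is an (H, W) array of per-pixel weights.

```python
import numpy as np


def weighted_position_map_loss(pred, target, weight_mask):
    """Weighted mean of per-pixel squared 3D error (illustrative only).

    pred, target: (H, W, 3) position maps, one (x, y, z) point per UV pixel.
    weight_mask:  (H, W) non-negative weights; larger values emphasize a region.
    """
    sq_err = np.sum((pred - target) ** 2, axis=-1)  # (H, W) squared distance
    return float(np.mean(sq_err * weight_mask))


def sample_vertices(position_map, uv_indices):
    """Read 3D points from a position map at integer (row, col) UV locations."""
    rows, cols = uv_indices[:, 0], uv_indices[:, 1]
    return position_map[rows, cols]  # (N, 3)


# Toy data: hypothetical 4x4 map; real maps would be far larger.
H = W = 4
target = np.zeros((H, W, 3))
pred = np.ones((H, W, 3))

# Hypothetical mask: weight 2 on the central 2x2 block, 0 elsewhere.
mask = np.zeros((H, W))
mask[1:3, 1:3] = 2.0

loss = weighted_position_map_loss(pred, target, mask)
points = sample_vertices(pred, np.array([[0, 0], [1, 1]]))
```

Because every UV pixel corresponds to a fixed location on the face surface, reading the map at a fixed set of UV indices yields semantically consistent 3D points, which is how a single regressed map can serve both reconstruction and alignment.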

Keywords

3D face reconstruction, Dense face alignment

Supplementary material

474202_1_En_33_MOESM1_ESM.pdf (125 kb)
Supplementary material 1 (pdf 124 KB)

Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  1. Cooperative Medianet Innovation Center, Shanghai Jiao Tong University, Shanghai, China
  2. CloudWalk Technology, Guangzhou, China
  3. CIGIT, Chinese Academy of Sciences, Chongqing, China
  4. University of Chinese Academy of Sciences, Beijing, China