Efficient deformable 3D face model tracking with limited hardware resources

Face fitting methods align deformable models to faces in images using the information given by the image pixels. However, most algorithms are designed for desktop personal computers (PCs) or other hardware with significant computational power. These approaches are therefore too demanding for devices with limited computational power, such as the increasingly common ARM-based devices. Besides the hardware limitations, the particularities of each operating system pose additional challenges to the implementation of real-time face tracking solutions. To address the lack of methods designed for platforms with limited computational power, we present an efficient way to fit 3D human face models to monocular images. This approach estimates the head pose and gesture in a 3D environment based on a full perspective projection, using parametric non-linear optimisation. We compare the performance of this method by running it on similar ARM-based devices with different operating systems (Linux, Android, and iOS). In all cases, we measured both accuracy and performance. The efficiency of the method makes it possible to run it in real time (\(\sim\)30 fps) on devices with limited computational power, such as smartphones and embedded systems. Efficient methods of this kind are a vital component of human behaviour analysis applications, such as driver monitoring systems and human-machine interfaces for disabled people, among others.
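To make the idea of fitting under a full perspective projection more concrete, the following minimal sketch estimates a rigid head pose by minimising the reprojection error of a few 3D landmarks. It is an illustration under stated assumptions, not the method of the article: the six-point toy model, the focal length, the image centre, the Euler-angle parameterisation, and the plain Gauss-Newton solver with a finite-difference Jacobian are all hypothetical choices, and the deformation (gesture) parameters of the face model are left out.

```python
import numpy as np

def rotation(rx, ry, rz):
    """Rotation matrix from Euler angles in radians, composed as Rz @ Ry @ Rx."""
    cx, sx = np.cos(rx), np.sin(rx)
    cy, sy = np.cos(ry), np.sin(ry)
    cz, sz = np.cos(rz), np.sin(rz)
    Rx = np.array([[1, 0, 0], [0, cx, -sx], [0, sx, cx]])
    Ry = np.array([[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]])
    Rz = np.array([[cz, -sz, 0], [sz, cz, 0], [0, 0, 1]])
    return Rz @ Ry @ Rx

def project(params, model, focal, centre):
    """Full perspective projection of the model vertices for a given pose.

    params = (rx, ry, rz, tx, ty, tz); each point is divided by its own depth.
    """
    cam = model @ rotation(*params[:3]).T + params[3:6]
    return focal * cam[:, :2] / cam[:, 2:3] + centre

def residuals(params, model, observed, focal, centre):
    """Stacked 2D reprojection errors in pixels."""
    return (project(params, model, focal, centre) - observed).ravel()

def fit_pose(model, observed, focal, centre, init, iters=20, eps=1e-6):
    """Gauss-Newton iterations with a forward-difference Jacobian (assumed solver)."""
    p = np.asarray(init, dtype=float).copy()
    for _ in range(iters):
        r = residuals(p, model, observed, focal, centre)
        J = np.empty((r.size, p.size))
        for k in range(p.size):
            q = p.copy()
            q[k] += eps
            J[:, k] = (residuals(q, model, observed, focal, centre) - r) / eps
        step = np.linalg.lstsq(J, -r, rcond=None)[0]
        p += step
        if np.linalg.norm(step) < 1e-10:
            break
    return p

# Hypothetical toy landmarks in metres: eye corners, nose tip, mouth corners, chin.
model = np.array([
    [-0.045, 0.035, 0.000], [0.045, 0.035, 0.000],
    [0.000, 0.000, 0.030],
    [-0.025, -0.035, 0.005], [0.025, -0.035, 0.005],
    [0.000, -0.070, 0.000],
])
focal, centre = 800.0, np.array([320.0, 240.0])  # assumed camera intrinsics

true_pose = np.array([0.10, -0.20, 0.05, 0.02, -0.01, 0.60])
observed = project(true_pose, model, focal, centre)  # synthetic, noise-free landmarks

# In tracking, the previous frame's pose is a natural starting point.
previous_pose = np.array([0.05, -0.15, 0.00, 0.00, 0.00, 0.55])
estimated = fit_pose(model, observed, focal, centre, previous_pose)
print(np.round(estimated, 4))
```

Starting each frame from the previous pose keeps the number of iterations small, which is one plausible reason an optimisation-based tracker can fit within a budget of roughly 33 ms per frame at 30 fps.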



  1. More information about the Nvidia hardware can be found in their sites and

  2. The projection of the 3D model is not centred in the image because the face region was cropped to improve the visibility of the face region in this text. However, the projection of the object corresponds to the centre of the original image.




This work has received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (grant agreement no. 690772, VI-DAS project).

Author information

Correspondence to Jon Goenetxea.




Cite this article

Goenetxea, J., Unzueta, L., Dornaika, F. et al. Efficient deformable 3D face model tracking with limited hardware resources. Multimed Tools Appl (2020) doi:10.1007/s11042-019-08515-y


  • Head pose estimation
  • Face tracking
  • Efficient computation
  • Computer vision