Face fitting methods align deformable models to faces in images using the information given by the image pixels. However, most algorithms are designed for desktop personal computers (PCs) or other hardware with significant computational power. These approaches are therefore too demanding for devices with limited computational power, such as the increasingly common ARM-based devices. Besides the hardware limitations, the particularities of each operating system pose additional challenges for the implementation of real-time face tracking solutions. To address the lack of methods designed for platforms with limited computational power, we present an efficient way to fit 3D human face models to monocular images. This approach estimates the head pose and gesture in a 3D environment based on a full perspective projection, using parametric non-linear optimisation. We compare the performance of this method by running it on similar ARM-based devices with different operating systems (Linux, Android, and iOS). In all cases, we measured both accuracy and performance. The efficiency of the method makes it possible to run it in real time (\(\sim\)30 fps) on devices with limited computational power, such as smartphones and embedded systems. Efficient methods of this kind are a vital component of human behaviour analysis applications, such as driver monitoring systems and human-machine interfaces for disabled people.
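To make the idea of fitting under a full perspective projection more concrete, the following is a minimal sketch of the kind of reprojection error that a parametric non-linear optimiser could minimise. It is not the method of the article: the Euler-angle convention, the pinhole parameters `focal`, `cx` and `cy`, and all function names are assumptions chosen for illustration, and only the rigid head pose is modelled (the gesture or deformation parameters are omitted).

```python
import math


def matmul(a, b):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def rotation_matrix(yaw, pitch, roll):
    """Build R = Rz(roll) * Ry(yaw) * Rx(pitch); angles in radians.

    The axis order is an assumed convention, not taken from the article.
    """
    cy, sy = math.cos(yaw), math.sin(yaw)
    cp, sp = math.cos(pitch), math.sin(pitch)
    cr, sr = math.cos(roll), math.sin(roll)
    rx = [[1.0, 0.0, 0.0], [0.0, cp, -sp], [0.0, sp, cp]]
    ry = [[cy, 0.0, sy], [0.0, 1.0, 0.0], [-sy, 0.0, cy]]
    rz = [[cr, -sr, 0.0], [sr, cr, 0.0], [0.0, 0.0, 1.0]]
    return matmul(rz, matmul(ry, rx))


def project(points, pose, focal, cx, cy):
    """Project 3D model points to pixels with a pinhole (perspective) camera.

    pose = (yaw, pitch, roll, tx, ty, tz); tz must keep points in front
    of the camera (positive depth).
    """
    yaw, pitch, roll, tx, ty, tz = pose
    r = rotation_matrix(yaw, pitch, roll)
    pixels = []
    for x, y, z in points:
        xc = r[0][0] * x + r[0][1] * y + r[0][2] * z + tx
        yc = r[1][0] * x + r[1][1] * y + r[1][2] * z + ty
        zc = r[2][0] * x + r[2][1] * y + r[2][2] * z + tz
        # Perspective division: the non-linearity of the projection.
        pixels.append((focal * xc / zc + cx, focal * yc / zc + cy))
    return pixels


def reprojection_error(points, observed, pose, focal, cx, cy):
    """Sum of squared pixel distances between projected and observed points."""
    projected = project(points, pose, focal, cx, cy)
    return sum((u - uo) ** 2 + (v - vo) ** 2
               for (u, v), (uo, vo) in zip(projected, observed))


# Hypothetical model points (metres) and camera parameters (pixels).
model = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (0.0, 0.1, 0.02), (-0.05, 0.05, 0.03)]
true_pose = (0.2, -0.1, 0.05, 0.01, -0.02, 0.6)
observed = project(model, true_pose, focal=500.0, cx=320.0, cy=240.0)
```

In a fitting loop, a non-linear least-squares solver would adjust `pose` (and, in a deformable model, the shape parameters) until `reprojection_error` stops decreasing. The error is zero at the pose that generated the observations and grows as the pose is perturbed.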
More information about the Nvidia hardware can be found on the Nvidia sites https://developer.nvidia.com/embedded/buy/jetson-tx1 and https://www.nvidia.com/en-us/self-driving-cars/drive-px.
The projection of the 3D model is not centred in the image because the image was cropped around the face to improve its visibility in this text. However, the projection of the object corresponds to the centre of the original image.
This work has received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (grant agreement no. 690772, VI-DAS project).
Cite this article
Goenetxea, J., Unzueta, L., Dornaika, F. et al. Efficient deformable 3D face model tracking with limited hardware resources. Multimed Tools Appl (2020) doi:10.1007/s11042-019-08515-y
- Head pose estimation
- Face tracking
- Efficient computation
- Computer vision