Multimedia Tools and Applications

Volume 75, Issue 13, pp 7799–7829

In-plane face orientation estimation in still images

  • Taner Danisman
  • Ioan Marius Bilasco


This paper addresses fine in-plane (roll) face orientation estimation for face analysis algorithms that require normalized frontal faces. Because most face analysers (e.g., gender, expression, and recognition) need frontal upright faces, and precise face normalization plays an important role in classification methods, there is a clear need for precise roll estimation. The in-plane orientation estimation algorithm is built on top of the regular Viola-Jones frontal face detector. When a face is detected for the first time, it is rotated about the face origin to find the angular boundaries of the detection. The mean of these boundary angles is taken as the measurement of the in-plane rotation of the face. Since only a face detection algorithm is needed, the proposed method can work effectively on very small faces, where traditional landmark-based (eye, mouth) or planar-detection-based estimations fail. Experiments on controlled and unconstrained large-scale datasets (CMU Rotated, YouTube, Boston University Face Tracking, Caltech, FG-NET Aging, BioID and Manchester Talking-Face) showed that the proposed method is robust to various settings for in-plane face orientation estimation in terms of RMSE and MAE. We achieved a mean absolute error of less than ±3.5° for roll estimation, which indicates that the accuracy of the proposed method is comparable to that of state-of-the-art tracking-based approaches for roll estimation.
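The boundary search described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the detector is abstracted as a hypothetical callback `detects(angle)` that reports whether a frontal face detector still fires after the face crop is counter-rotated by `angle` degrees, and the names `estimate_roll`, `step` and `limit`, the 1° step, the ±90° search range and the sign convention are all assumptions made for illustration. The `mae` and `rmse` helpers only show the standard definitions of the two error measures named above.

```python
import math
from typing import Callable, Optional, Sequence


def estimate_roll(detects: Callable[[float], bool],
                  step: float = 1.0,
                  limit: float = 90.0) -> Optional[float]:
    """Return the mean of the two angular detection boundaries, in degrees.

    `detects(angle)` is a hypothetical callback: it should return True when
    the face detector fires after the crop is counter-rotated by `angle`.
    Returns None when the face is not detected at 0 degrees to begin with.
    """
    if not detects(0.0):
        return None

    def boundary(direction: float) -> float:
        # Walk away from 0 degrees until the detector stops firing
        # (or the assumed search limit is reached); keep the last hit.
        angle = 0.0
        while abs(angle + direction * step) <= limit and detects(angle + direction * step):
            angle += direction * step
        return angle

    low = boundary(-1.0)
    high = boundary(+1.0)
    # The roll estimate is the midpoint of the detection interval.
    return (low + high) / 2.0


def mae(estimates: Sequence[float], truths: Sequence[float]) -> float:
    """Mean absolute error between estimated and ground-truth angles."""
    return sum(abs(e - t) for e, t in zip(estimates, truths)) / len(truths)


def rmse(estimates: Sequence[float], truths: Sequence[float]) -> float:
    """Root mean squared error between estimated and ground-truth angles."""
    return math.sqrt(sum((e - t) ** 2 for e, t in zip(estimates, truths)) / len(truths))


def simulated_detector(true_roll: float, tolerance: float) -> Callable[[float], bool]:
    """Toy stand-in for a real detector: fires within `tolerance` of upright."""
    return lambda angle: abs(angle - true_roll) <= tolerance
```

With a real detector, the callback would rotate the image about the face centre and rerun detection; the toy `simulated_detector` only mimics a detector whose rotation tolerance is symmetric around the upright pose, which is the assumption that makes the midpoint a sensible estimate.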


In-plane rotation estimation; Roll estimation; Head-pose estimation



This study is supported by the TWIRL (ITEA2 10029 - Twinning Virtual World Online Information with Real-World Data Sources) project.



Copyright information

© Springer Science+Business Media New York 2015

Authors and Affiliations

  1. IRCICA, Parc Scientifique de la Haute Borne, Lille 1 University, Villeneuve d’Ascq, France
  2. Faculty of Engineering, Computer Engineering Department, Akdeniz University, Dumlupinar Bulvari, Turkey
