Benchmarking Head Pose Estimation in-the-Wild

  • Elvira Amador
  • Roberto Valle
  • José M. Buenaposada
  • Luis Baumela
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10657)

Abstract

Head pose estimation systems have quickly evolved from simple classifiers that estimate a few yaw angles to recent regression approaches that provide precise 3D face orientations in images acquired “in-the-wild”. Accurate evaluation of these algorithms remains an open issue. Although the most recent approaches are tested on a few challenging annotated databases, their published results are not comparable. In this paper we review these works, define a common evaluation methodology, and establish a new state of the art for this problem.

Keywords

Head pose estimation, Convolutional neural networks

Notes

Acknowledgments

The authors gratefully acknowledge computer resources provided by the Super-computing and Visualization Center of Madrid (CeSViMa) and funding from the Spanish Ministry of Economy and Competitiveness under projects TIN2013-47630-C2-2-R and TIN2016-75982-C2-2-R.

Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

  • Elvira Amador (1)
  • Roberto Valle (1)
  • José M. Buenaposada (2)
  • Luis Baumela (1)
  1. Univ. Politécnica Madrid, Madrid, Spain
  2. Univ. Rey Juan Carlos, Móstoles, Spain
