Sparse-MVRVMs Tree for Fast and Accurate Head Pose Estimation in the Wild

Selim, Mohamed; Pagani, Alain; Stricker, Didier

doi:10.1007/978-3-319-64689-3_20


Part of the book series: Lecture Notes in Computer Science (LNIP, volume 10424)

Included in the following conference series:

International Conference on Computer Analysis of Images and Patterns


Abstract

Head pose estimation is an important problem in computer vision and facial analysis. We model head pose estimation as a regression problem in which the three rotation angles (yaw, pitch, roll) are functions of the face appearance. We exploit this by learning the appearance of the face with a tree cascade of sparse Multi-Variate Relevance Vector Machines (MVRVMs). The method is computationally inexpensive, which makes it fast and suitable for real-time applications. We evaluated our approach on two challenging datasets, YouTube Faces and the Point and Shoot Challenge (PaSC) dataset. For head pose estimation (yaw, pitch, roll), we achieved a mean error of less than 5° with an error tolerance of less than ±4 on the PaSC dataset. In terms of speed, one prediction takes around 6 milliseconds, which is suitable for real-time applications, including those with high frame rates.
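The regression view in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the Gaussian kernel, the yaw-based routing between tree nodes, and every name below (`rbf_kernel`, `predict_pose`, `predict_with_tree`, `yaw_edges`) are assumptions made for illustration, and the training step that selects the sparse set of relevance vectors is omitted. The sketch only shows how a trained sparse multi-output kernel regressor maps one feature vector to three angles, and one hypothetical way a tree could pass a coarse estimate to a more specialised regressor.

```python
import math


def rbf_kernel(x, z, gamma=0.5):
    """Gaussian (RBF) kernel between two feature vectors (assumed kernel choice)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)


def predict_pose(x, relevance_vectors, weights, bias, gamma=0.5):
    """Sparse multi-output kernel regression returning (yaw, pitch, roll).

    relevance_vectors: the few training samples kept after sparse training.
    weights[i]: the three output weights attached to relevance vector i.
    bias: three constant offsets, one per angle.
    """
    out = list(bias)
    for rv, w in zip(relevance_vectors, weights):
        k = rbf_kernel(x, rv, gamma)
        for d in range(3):
            out[d] += k * w[d]
    return tuple(out)


def predict_with_tree(x, root, children, yaw_edges):
    """Hypothetical two-level cascade: the root gives a coarse pose, and its
    yaw value selects which child regressor produces the final estimate.

    yaw_edges: sorted bin boundaries; len(children) == len(yaw_edges) + 1.
    """
    coarse_yaw = predict_pose(x, **root)[0]
    child_index = sum(coarse_yaw >= edge for edge in yaw_edges)
    return predict_pose(x, **children[child_index])


# Toy models with made-up numbers, only to show the call pattern.
toy_root = {
    "relevance_vectors": [(0.0, 0.0), (1.0, 1.0)],
    "weights": [(10.0, 0.0, 0.0), (0.0, 5.0, 0.0)],
    "bias": (0.0, 0.0, 0.0),
}
toy_children = [
    {"relevance_vectors": [], "weights": [], "bias": (-30.0, 0.0, 0.0)},
    {"relevance_vectors": [], "weights": [], "bias": (12.0, 1.0, 0.0)},
]
toy_pose = predict_with_tree((0.0, 0.0), toy_root, toy_children, yaw_edges=[0.0])
```

Prediction cost in this kind of model grows with the number of relevance vectors retained, so a sparse model keeps each evaluation cheap, in line with the abstract's emphasis on speed.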



Acknowledgments

This work has been partially funded by the University project Zentrums für Nutzfahrzeugtechnologie (ZNT), and the European project Eyes of Things (EoT) under contract number GA643924.

Author information

Authors and Affiliations

Augmented Vision Research Group, German Research Center for Artificial Intelligence (DFKI), Technical University of Kaiserslautern, Trippstadter Str. 122, 67663, Kaiserslautern, Germany
Mohamed Selim, Alain Pagani & Didier Stricker


Corresponding author

Correspondence to Mohamed Selim.

Editor information

Editors and Affiliations

Linköping University, Linköping, Sweden
Michael Felsberg
Lund University, Lund, Sweden
Anders Heyden
University of Southern Denmark, Odense, Denmark
Norbert Krüger


About this paper

Cite this paper

Selim, M., Pagani, A., Stricker, D. (2017). Sparse-MVRVMs Tree for Fast and Accurate Head Pose Estimation in the Wild. In: Felsberg, M., Heyden, A., Krüger, N. (eds) Computer Analysis of Images and Patterns. CAIP 2017. Lecture Notes in Computer Science, vol 10424. Springer, Cham. https://doi.org/10.1007/978-3-319-64689-3_20


DOI: https://doi.org/10.1007/978-3-319-64689-3_20
Published: 28 July 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-64688-6
Online ISBN: 978-3-319-64689-3
eBook Packages: Computer Science, Computer Science (R0)
