3D Head Pose Estimation for TV Setups

  • Julien Leroy
  • Francois Rocca
  • Matei Mancaş
  • Bernard Gosselin
Conference paper
Part of the Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering book series (LNICST, volume 124)


In this paper, we present the architecture of a system that aims to personalize TV content according to the viewer's reactions. The paper focuses on a subset of this system that identifies moments of attentive focus in a non-invasive and continuous way. The attentive focus is used to dynamically improve the user profile by detecting which displayed media or links have drawn the user's attention. Our method is based on the detection and estimation of face pose in 3D using a consumer depth camera. Two preliminary experiments were carried out to test the method and to show its link to viewer interest. The study is set in a scenario of TV viewing with second-screen interaction (tablet, smartphone), a behaviour that has become common among viewers.
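As a rough illustration of how a 3D head pose could be mapped to an attended screen, the following minimal sketch picks the target (TV or second screen) whose direction lies closest to the head's facing direction. It is not the authors' implementation. The coordinate convention, the function names, the target positions and the 20-degree threshold are all assumptions made for this example; a real system would take the head position and angles from a depth-camera pose estimator.

```python
import math

def head_direction(yaw_deg, pitch_deg):
    """Unit facing vector for a head pose.

    Assumed convention (hypothetical): x points right, y up, z from the
    camera toward the viewer, so yaw = pitch = 0 faces the camera.
    """
    yaw = math.radians(yaw_deg)
    pitch = math.radians(pitch_deg)
    return (
        math.sin(yaw) * math.cos(pitch),
        math.sin(pitch),
        -math.cos(yaw) * math.cos(pitch),
    )

def angle_to_target(head_pos, direction, target_pos):
    """Angle in degrees between the facing vector and the head-to-target vector."""
    to_target = tuple(t - h for t, h in zip(target_pos, head_pos))
    norm = math.sqrt(sum(c * c for c in to_target))
    if norm == 0.0:
        return 0.0
    cos_a = sum(d * c for d, c in zip(direction, to_target)) / norm
    cos_a = max(-1.0, min(1.0, cos_a))  # guard against rounding error
    return math.degrees(math.acos(cos_a))

def attended_target(head_pos, yaw_deg, pitch_deg, targets, max_angle_deg=20.0):
    """Return the name of the target closest to the facing direction,
    or None if no target lies within max_angle_deg (an assumed threshold)."""
    direction = head_direction(yaw_deg, pitch_deg)
    best_name, best_angle = None, max_angle_deg
    for name, pos in targets.items():
        angle = angle_to_target(head_pos, direction, pos)
        if angle <= best_angle:
            best_name, best_angle = name, angle
    return best_name

# Hypothetical setup, in metres: camera on the TV, tablet on the viewer's lap.
targets = {"tv": (0.0, 0.0, 0.0), "tablet": (0.0, -0.5, 1.5)}
head = (0.0, 0.0, 2.0)
print(attended_target(head, 0.0, 0.0, targets))
print(attended_target(head, 0.0, -45.0, targets))
```

Logging the returned label over time would give a continuous record of which screen holds the viewer's attention. Head orientation is only a proxy for gaze, so the threshold would need tuning against real data.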


attention, head pose estimation, second screen interaction, eye tracking, Facelab, future TV, personalization





Copyright information

© ICST Institute for Computer Science, Social Informatics and Telecommunications Engineering 2013

Authors and Affiliations

  • Julien Leroy (1)
  • Francois Rocca (1)
  • Matei Mancaş (1)
  • Bernard Gosselin (1)
  1. Faculty of Engineering (FPMs), University of Mons (UMONS), Mons, Belgium
