Multimedia Tools and Applications

Volume 78, Issue 14, pp 19141–19162

Head pose estimation using improved label distribution learning with fewer annotations

  • Luhui Xu
  • Jingying Chen
  • Yanling Gan


Head pose estimation in unconstrained environments remains a challenging task due to background clutter, illumination changes, and appearance variability. Multivariate label distribution has been successfully applied to head pose estimation. However, it is not applicable to unconstrained environments, where assigning reasonable label distributions to images is difficult, and its performance degrades significantly when accurate grid information is unavailable (e.g., when only yaw angles are known). To alleviate these problems, we propose an improved label distribution learning approach that requires fewer annotations. A data-driven weak learning strategy is first developed to construct label distributions, which mitigates the problem of unreasonable label distributions. Regularization terms (e.g., the L1,2 norm) are then introduced into the loss function induced by the weighted Jeffreys divergence to avoid over-fitting. To further improve performance, positive correlation and negative competition are also introduced into the loss function to fine-tune the parameters of the corresponding model. Extensive experiments have been conducted on two public databases, LFW and Pointing04. The proposed method achieves performance comparable to the state of the art and generalizes well while using fewer annotations, which suggests that it has strong potential for head pose estimation in unconstrained environments where sufficient annotations are routinely unavailable.
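
As a rough illustration of the kind of loss the abstract describes, the following Python sketch computes a weighted Jeffreys divergence between a target and a predicted label distribution and adds a group-sparsity penalty on a parameter matrix. It is not the authors' implementation: the per-label weights, the function names (`jeffreys_divergence`, `group_norm`, `regularized_loss`), the trade-off parameter `lam`, and the choice of a sum of row-wise Euclidean norms as the regularizer are all assumptions made for this example.

```python
import math


def jeffreys_divergence(p, q, weights=None, eps=1e-12):
    """Weighted Jeffreys divergence: sum_i w_i * (p_i - q_i) * (ln p_i - ln q_i).

    p and q are discrete distributions of equal length. The per-label
    weights are an assumption of this sketch; they default to 1.
    """
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    if weights is None:
        weights = [1.0] * len(p)
    total = 0.0
    for w, pi, qi in zip(weights, p, q):
        # Clamp to eps so that log(0) is never evaluated.
        pi = max(pi, eps)
        qi = max(qi, eps)
        total += w * (pi - qi) * (math.log(pi) - math.log(qi))
    return total


def group_norm(matrix):
    """Sum of the Euclidean norms of the rows (a group-sparsity penalty)."""
    return sum(math.sqrt(sum(v * v for v in row)) for row in matrix)


def regularized_loss(target, predicted, params, lam=0.1, weights=None):
    """Divergence between target and prediction plus lam times the penalty."""
    return jeffreys_divergence(target, predicted, weights) + lam * group_norm(params)


if __name__ == "__main__":
    target = [0.1, 0.7, 0.2]      # hypothetical label distribution over 3 poses
    predicted = [0.2, 0.5, 0.3]   # hypothetical model output
    params = [[3.0, 4.0], [0.0, 0.0]]
    print(regularized_loss(target, predicted, params))
```

The Jeffreys divergence is symmetric in its two arguments and is zero only when the two distributions coincide, which is why it can serve as a distance-like loss between label distributions.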


Head pose estimation; Weak learning; Regularization; Fine-tune



This work was supported by the National Key Research and Development Program of China (Grant No. 2018YFB1004504) and the Research Funds of CCNU from the Colleges’ Basic Research and Operation of MOE (Grant No. CCNU17ZDJC04).



Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2019

Authors and Affiliations

  1. National Engineering Research Center for E-Learning, Central China Normal University, Wuhan, China
