Multimedia Tools and Applications, Volume 73, Issue 3, pp 1863–1884

An effective head pose estimation approach using Lie Algebrized Gaussians based face representation

  • Chunlong Hu
  • Liyu Gong
  • Tianjiang Wang
  • Fang Liu
  • Qi Feng (corresponding author)

Abstract

Accurate head pose estimation is important for many computer vision applications, such as face recognition, driver attention detection and human-computer interaction. Most appearance-based head pose estimation methods extract low-dimensional face appearance features in statistical subspaces that represent the underlying geometric structure of the pose space. However, how to effectively represent the appearance-based subspace face for head pose estimation remains an open problem. To address it, this paper proposes a head pose estimation approach that uses the Lie Algebrized Gaussians (LAG) feature to model pose characteristics. LAG is built on Gaussian Mixture Models (GMM); it not only models the distribution of local appearance features but also captures the Lie group manifold structure of the feature space. Moreover, to preserve multi-resolution structural information, LAG is computed on many subregions of the image. These properties enable LAG to model the structure of the subspace face effectively, which gives it strong discriminative ability for head pose estimation. After representing the subspace face with LAG, we treat head pose estimation as a classification problem. A Support Vector Machine (SVM) classifier based on within-class covariance normalization (WCCN) is employed to achieve robust performance, since WCCN reduces the within-class variability of samples with the same pose. Extensive experimental analysis and comparison with both traditional and state-of-the-art algorithms on two challenging benchmarks demonstrate the effectiveness of our approach.
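
The role of WCCN described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the `ridge` regularizer and the pure-Python matrix routines are assumptions made for illustration, and the input vectors are assumed to be feature vectors (such as LAG vectors) that have already been extracted. The sketch estimates the within-class covariance W of the training features, factors it as W = L Lᵀ, and maps each feature x to L⁻¹x, so that the inner product of two mapped features equals xᵀW⁻¹y and the within-class covariance of the mapped features is close to the identity.

```python
import math


def within_class_covariance(features, labels, ridge=1e-6):
    """Class-frequency-weighted average of the per-class covariance matrices.

    `ridge` (an assumption, not from the paper) is added to the diagonal so
    that the matrix stays positive definite when samples are scarce.
    """
    n, dim = len(features), len(features[0])
    groups = {}
    for x, label in zip(features, labels):
        groups.setdefault(label, []).append(x)
    W = [[0.0] * dim for _ in range(dim)]
    for rows in groups.values():
        mean = [sum(col) / len(rows) for col in zip(*rows)]
        for x in rows:
            d = [xi - mi for xi, mi in zip(x, mean)]
            for i in range(dim):
                for j in range(dim):
                    # Summing deviations over all classes and dividing by n
                    # equals weighting each class covariance by n_c / n.
                    W[i][j] += d[i] * d[j] / n
    for i in range(dim):
        W[i][i] += ridge
    return W


def cholesky(A):
    """Lower-triangular L with A = L L^T (A must be positive definite)."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L


def wccn_transform(L, x):
    """Return L^{-1} x by forward substitution."""
    y = []
    for i in range(len(x)):
        s = x[i] - sum(L[i][k] * y[k] for k in range(i))
        y.append(s / L[i][i])
    return y


# Toy data: two pose classes with 2-D features (hypothetical values).
X = [[0, 0], [2, 0], [0, 1], [2, 1], [5, 5], [6, 7], [7, 6], [8, 8]]
y = [0, 0, 0, 0, 1, 1, 1, 1]
L = cholesky(within_class_covariance(X, y))
X_wccn = [wccn_transform(L, x) for x in X]
```

Under these assumptions, the vectors in `X_wccn` could then be passed to a linear SVM, which would be equivalent to training an SVM with the kernel xᵀW⁻¹y on the original features.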

Keywords

Head pose estimation, Structure properties, Gaussian mixture models, Lie algebrized Gaussians, Classification

Notes

Acknowledgements

The authors thank the editors and the anonymous referees for their valuable comments. This work was supported by the National Natural Science Foundation of China under grant numbers 61073094 and U1233119. The authors would also like to thank Xinwei Jiang for his help.

Copyright information

© Springer Science+Business Media New York 2013

Authors and Affiliations

  • Chunlong Hu (1)
  • Liyu Gong (2)
  • Tianjiang Wang (1)
  • Fang Liu (1)
  • Qi Feng (1, corresponding author)
  1. School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, China
  2. Eedoo Inc, Beijing, China
