Leveraging deep learning with symbolic sequences for robust head poses estimation

  • Hayet Mekami
  • Abdennacer Bounoua
  • Sidahmed Benabderrahmane
Short paper


Head pose estimation is a challenging topic in computer vision with a wide range of applications. Many methods for pose estimation have been proposed in the literature. Although their performance is acceptable, their sensitivity to external conditions remains a major challenge. In this paper, we propose a new model for head pose estimation. First, the face images are converted into one-dimensional vectors, treated as time series, using the Peano–Hilbert space-filling curve. Then, we convert these numerical series into symbolic sequences using suitable dimensionality reduction techniques. These sequences are then used as input to an encoder–decoder neural network that learns to generate labels for the face orientations. We evaluated our model on several databases, and the experimental results show that the proposed method is very competitive with other well-known approaches.
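The first two stages described above (Peano–Hilbert linearization, then symbolic conversion) can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `hilbert_d2xy`, `hilbert_scan` and `sax_word`, the four-letter alphabet, and the segment count are assumptions chosen for illustration, and the symbolic step is assumed to follow the standard SAX recipe (z-normalization, piecewise aggregate approximation, Gaussian breakpoints). The encoder–decoder stage is omitted.

```python
from bisect import bisect_right
from statistics import NormalDist, fmean, pstdev


def hilbert_d2xy(n, d):
    """Map position d along a Hilbert curve to (x, y) on an n x n grid.

    n must be a power of two. Standard iterative construction.
    """
    x = y = 0
    t = d
    s = 1
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:  # rotate or flip the current quadrant
            if rx == 1:
                x, y = s - 1 - x, s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


def hilbert_scan(image):
    """Flatten a square image (list of rows) into a 1-D series along the curve."""
    n = len(image)
    if n == 0 or n & (n - 1) or any(len(row) != n for row in image):
        raise ValueError("image must be square with a power-of-two side")
    series = []
    for d in range(n * n):
        x, y = hilbert_d2xy(n, d)
        series.append(image[y][x])
    return series


def sax_word(series, n_segments, alphabet="abcd"):
    """Convert a numeric series into a SAX-style symbolic word (illustrative)."""
    if len(series) % n_segments:
        raise ValueError("series length must be divisible by n_segments")
    mu, sigma = fmean(series), pstdev(series)
    # z-normalize; a constant series maps to all zeros
    z = [(v - mu) / sigma if sigma > 0 else 0.0 for v in series]
    # piecewise aggregate approximation: mean of each equal-length segment
    seg = len(z) // n_segments
    paa = [fmean(z[i * seg:(i + 1) * seg]) for i in range(n_segments)]
    # breakpoints split the standard normal into equiprobable regions
    a = len(alphabet)
    breakpoints = [NormalDist().inv_cdf(k / a) for k in range(1, a)]
    return "".join(alphabet[bisect_right(breakpoints, v)] for v in paa)


if __name__ == "__main__":
    # Hypothetical 4 x 4 grayscale patch standing in for a face image.
    patch = [[r * 4 + c for c in range(4)] for r in range(4)]
    print(sax_word(hilbert_scan(patch), n_segments=4))
```

Because consecutive points on the Hilbert curve are always neighbouring pixels, the resulting series preserves more spatial locality than a row-by-row scan, which is the usual motivation for this kind of linearization.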


Head pose estimation, Time series, Encoder–decoder recurrent network, Symbolic aggregate approximation, Sequence to sequence




Copyright information

© Springer-Verlag London Ltd., part of Springer Nature 2019

Authors and Affiliations

  1. Djillali Liabes University, Sidi Bel Abbés, Algeria
  2. The University of Edinburgh, Edinburgh, UK
