Improving Face Pose Estimation Using Long-Term Temporal Averaging for Stochastic Optimization

Passalis, Nikolaos; Tefas, Anastasios

doi:10.1007/978-3-319-65172-9_17

Improving Face Pose Estimation Using Long-Term Temporal Averaging for Stochastic Optimization

Nikolaos Passalis¹³ &
Anastasios Tefas¹³

Conference paper
First Online: 02 August 2017

2918 Accesses
4 Citations

Part of the book series: Communications in Computer and Information Science ((CCIS,volume 744))

Abstract

Among the most crucial components of an intelligent system capable of assisting drone-based cinematography is estimating the pose of the main actors. However, training deep CNNs towards this task is not straightforward, mainly due to the noisy nature of the data and instabilities that occur during the learning process, significantly slowing down the development of such systems. In this work we propose a temporal averaging technique that is capable of stabilizing as well as speeding up the convergence of stochastic optimization techniques for neural network training. We use two face pose estimation datasets to experimentally verify that the proposed method can improve both the convergence of training algorithms and the accuracy of pose estimation. This also reduces the risk of stopping the training process when a bad descent step was taken and the learning rate was not appropriately set, ensuring that the network will perform well at any point of the training process.

This is a preview of subscription content, log in via an institution.

Chapter: USD 29.95; Price excludes VAT (USA)

eBook: USD 84.99; Price excludes VAT (USA)

Softcover Book: USD 109.99; Price excludes VAT (USA)

Tax calculation will be finalised at checkout

Purchases are for personal use only

Learn about institutional subscriptions

References

Duchi, J., Hazan, E., Singer, Y.: Adaptive subgradient methods for online learning and stochastic optimization. J. Mach. Learn. Res. 12(Jul), 2121–2159 (2011)
MathSciNet MATH Google Scholar
Goudelis, G., Tefas, A., Pitas, I.: Emerging biometric modalities: a survey. J. Multimodal User Interfaces 2(3), 217–235 (2008)
Article Google Scholar
Gourier, N., Hall, D., Crowley, J.L.: Estimating face orientation from robust detection of salient facial structures. In: FG NET Workshop on Visual Observation of Deictic Gestures (2004)
Google Scholar
Haykin, S., Network, N.: A comprehensive foundation. Neural Netw. 2(2004), 41 (2004)
Google Scholar
He, K., Zhang, X., Ren, S., Sun, J.: Delving deep into rectifiers: surpassing human-level performance on ImageNet classification. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 1026–1034 (2015)
Google Scholar
He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778 (2016)
Google Scholar
Ioffe, S., Szegedy, C.: Batch normalization: accelerating deep network training by reducing internal covariate shift. In: Proceedings of the 32nd International Conference on Machine Learning, pp. 448–456 (2015)
Google Scholar
Jarrett, K., Kavukcuoglu, K., LeCun, Y., et al.: What is the best multi-stage architecture for object recognition? In: Proceedings of the IEEE International Conference on Computer Vision, pp. 2146–2153 (2009)
Google Scholar
Kingma, D., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
Koestinger, M., Wohlhart, P., Roth, P.M., Bischof, H.: Annotated facial landmarks in the wild: a large-scale, real-world database for facial landmark localization. In: First IEEE International Workshop on Benchmarking Facial Image Analysis Technologies (2011)
Google Scholar
Krizhevsky, A., Sutskever, I., Hinton, G.E.: ImageNet classification with deep convolutional neural networks. In: Proceedings of the Advances in Neural Information Processing Systems, pp. 1097–1105 (2012)
Google Scholar
Lillicrap, T.P., Hunt, J.J., Pritzel, A., Heess, N., Erez, T., Tassa, Y., Silver, D., Wierstra, D.: Continuous control with deep reinforcement learning. arXiv preprint arXiv:1509.02971 (2015)
Liu, W., Anguelov, D., Erhan, D., Szegedy, C., Reed, S., Fu, C.Y., Berg, A.C.: SSD: single shot MultiBox detector. In: Proceedings of the European Conference on Computer Vision, pp. 21–37 (2016)
Google Scholar
Mnih, V., Kavukcuoglu, K., Silver, D., Rusu, A.A., Veness, J., Bellemare, M.G., Graves, A., Riedmiller, M., Fidjeland, A.K., Ostrovski, G., et al.: Human-level control through deep reinforcement learning. Nature 518(7540), 529–533 (2015)
Article Google Scholar
Nousi, P., Tefas, A.: Deep learning algorithms for discriminant autoencoding. Neurocomputing (2017)
Google Scholar
Passalis, N., Tefas, A.: Learning neural bag-of-features for large-scale image retrieval. IEEE Trans. Syst. Man Cybern.: Syst. (2017)
Google Scholar
Passalis, N., Tefas, A.: Neural bag-of-features learning. Pattern Recogn. 64, 277–294 (2017)
Article Google Scholar
Polyak, B.T., Juditsky, A.B.: Acceleration of stochastic approximation by averaging. SIAM J. Control Optim. 30(4), 838–855 (1992)
Article MathSciNet MATH Google Scholar
Redmon, J., Divvala, S., Girshick, R., Farhadi, A.: You only look once: unified, real-time object detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 779–788 (2016)
Google Scholar
Ruppert, D.: Efficient estimations from a slowly convergent robbins-monro process. Cornell University Operations Research and Industrial Engineering, Technical report (1988)
Google Scholar
Srivastava, N., Hinton, G.E., Krizhevsky, A., Sutskever, I., Salakhutdinov, R.: Dropout: a simple way to prevent neural networks from overfitting. J. Mach. Learn. Res. 15(1), 1929–1958 (2014)
MathSciNet MATH Google Scholar
Srivastava, R.K., Greff, K., Schmidhuber, J.: Highway networks. arXiv preprint arXiv:1505.00387 (2015)
Toshev, A., Szegedy, C.: DeepPose: human pose estimation via deep neural networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1653–1660 (2014)
Google Scholar
Zhu, X., Ramanan, D.: Face detection, pose estimation, and landmark localization in the wild. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2879–2886 (2012)
Google Scholar

Download references

Acknowledgments

This project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No. 731667 (MULTIDRONE). This publication reflects the authors’ views only. The European Commission is not responsible for any use that may be made of the information it contains.

Author information

Authors and Affiliations

Department of Informatics, Aristotle University of Thessaloniki, Thessaloniki, Greece
Nikolaos Passalis & Anastasios Tefas

Authors

Nikolaos Passalis
View author publications
You can also search for this author in PubMed Google Scholar
Anastasios Tefas
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Nikolaos Passalis .

Editor information

Editors and Affiliations

Politecnico di Milano, Milan, Italy
Giacomo Boracchi
Democritus University of Thrace, University Campus, Xanthi, Greece
Lazaros Iliadis
School of Computing Science and Digital Media, Robert Gordon University, Aberdeen, United Kingdom
Chrisina Jayne
Univesity of Ioannina, Ioannina, Greece
Aristidis Likas

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Passalis, N., Tefas, A. (2017). Improving Face Pose Estimation Using Long-Term Temporal Averaging for Stochastic Optimization. In: Boracchi, G., Iliadis, L., Jayne, C., Likas, A. (eds) Engineering Applications of Neural Networks. EANN 2017. Communications in Computer and Information Science, vol 744. Springer, Cham. https://doi.org/10.1007/978-3-319-65172-9_17

Download citation

DOI: https://doi.org/10.1007/978-3-319-65172-9_17
Published: 02 August 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-65171-2
Online ISBN: 978-3-319-65172-9
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics