Shape and Appearance Based Sequenced Convnets to Detect Real-Time Face Attributes on Mobile Devices

Livet, Nicolas; Berkowski, George

doi:10.1007/978-3-319-94544-6_8

Conference paper
First Online: 17 June 2018

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 10945)

Abstract

Classifying facial attributes has attracted strong interest in computer vision from both researchers and industry. Approaches based on deep neural networks are now widespread for such tasks and reach higher detection accuracy than earlier hand-designed approaches. This paper reports how preprocessing and face image alignment influence accuracy when detecting face attributes. More importantly, it demonstrates that combining a representation of the shape of a face with its appearance, organized as a sequence of convolutional neural networks, improves facial attribute classification scores over previous work on the FER+ dataset. While most studies in the field have tried to improve detection accuracy by averaging multiple very deep networks, the work presented here concentrates on building efficient models while maintaining high accuracy. By taking advantage of the face shape component and relying on an efficient shallow CNN architecture, we unveil the first available, highly accurate real-time implementation on mobile browsers.
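
The abstract states that face image alignment influences attribute accuracy but does not say how alignment is performed. As an illustration only, the following minimal sketch shows one common landmark-based scheme: a similarity transform that levels the line between the two eyes and normalizes the inter-ocular distance. The function names, the choice of eye landmarks, and the `target_dist` parameter are assumptions made for this example and are not taken from the paper.

```python
import math


def eye_alignment_transform(left_eye, right_eye, target_dist=40.0):
    """Return (rotation_rad, scale) that levels the eye line.

    Hypothetical helper: eyes are (x, y) pixel coordinates and
    target_dist is the desired inter-ocular distance in pixels.
    """
    dx = right_eye[0] - left_eye[0]
    dy = right_eye[1] - left_eye[1]
    dist = math.hypot(dx, dy)
    if dist == 0.0:
        raise ValueError("eye landmarks coincide")
    tilt = math.atan2(dy, dx)  # current angle of the eye line
    # Rotate by the opposite angle and rescale to the target distance.
    return -tilt, target_dist / dist


def apply_similarity(point, center, rotation_rad, scale):
    """Rotate and scale a point about a center (no extra translation)."""
    x = point[0] - center[0]
    y = point[1] - center[1]
    c = math.cos(rotation_rad)
    s = math.sin(rotation_rad)
    return (center[0] + scale * (c * x - s * y),
            center[1] + scale * (s * x + c * y))


if __name__ == "__main__":
    left, right = (30.0, 40.0), (70.0, 60.0)
    center = ((left[0] + right[0]) / 2.0, (left[1] + right[1]) / 2.0)
    rotation, scale = eye_alignment_transform(left, right)
    print(apply_similarity(left, center, rotation, scale))
    print(apply_similarity(right, center, rotation, scale))
```

In practice the same rotation and scale would be applied to the image itself (for example through an affine warp) before the crop is passed to a classifier, so that every face reaches the network at a consistent orientation and size.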

Author information

Authors and Affiliations

DeepAR LTD., London, EC1V 8AB, UK
Nicolas Livet & George Berkowski

Corresponding author

Correspondence to Nicolas Livet.

Editor information

Editors and Affiliations

UIB – Universitat de les Illes Balears, Palma de Mallorca, Spain
Francisco José Perales
University of Surrey, Guildford, United Kingdom
Josef Kittler

About this paper

Cite this paper

Livet, N., Berkowski, G. (2018). Shape and Appearance Based Sequenced Convnets to Detect Real-Time Face Attributes on Mobile Devices. In: Perales, F., Kittler, J. (eds) Articulated Motion and Deformable Objects. AMDO 2018. Lecture Notes in Computer Science, vol 10945. Springer, Cham. https://doi.org/10.1007/978-3-319-94544-6_8

DOI: https://doi.org/10.1007/978-3-319-94544-6_8
Published: 17 June 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-94543-9
Online ISBN: 978-3-319-94544-6
eBook Packages: Computer Science, Computer Science (R0)
