
Emotion Recognition of a Group of People in Video Analytics Using Deep Off-the-Shelf Image Embeddings

  • Conference paper
Analysis of Images, Social Networks and Texts (AIST 2018)

Abstract

In this paper, we address the group-level emotion classification problem in video analytics systems. We propose to apply the MTCNN face detector to locate facial regions in each video frame. Next, off-the-shelf image features are extracted from each detected face using pre-trained convolutional neural networks. The features of the whole frame are computed as the mean of the image embeddings of the individual faces. The resulting frame features are classified by an ensemble of state-of-the-art classifiers whose decision is a weighted sum of their outputs. Experimental results on the EmotiW 2017 dataset demonstrate that the proposed approach is 2–20% more accurate than conventional group-level emotion classifiers.
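
A rough sketch of the described pipeline is given below. It assumes the facenet-pytorch implementations of MTCNN and a face CNN pre-trained on VGGFace2 for the embeddings, and an sklearn soft-voting ensemble for the weighted sum of classifier outputs; the detector, embedding networks, classifiers and weights actually used in the paper are not specified here and may differ.

    # Sketch of the group-level emotion pipeline from the abstract (illustrative only).
    import numpy as np
    import torch
    from PIL import Image
    from facenet_pytorch import MTCNN, InceptionResnetV1
    from sklearn.ensemble import RandomForestClassifier, VotingClassifier
    from sklearn.linear_model import LogisticRegression
    from sklearn.svm import SVC

    detector = MTCNN(keep_all=True)                              # face detection on every frame
    embedder = InceptionResnetV1(pretrained='vggface2').eval()   # off-the-shelf face embeddings

    def frame_descriptor(frame):
        """Average the embeddings of all faces detected in a frame into one feature vector."""
        faces = detector(frame)                  # tensor (n_faces, 3, 160, 160) or None
        if faces is None:
            return None                          # no faces found in this frame
        with torch.no_grad():
            embeddings = embedder(faces)         # (n_faces, 512) face embeddings
        return embeddings.mean(dim=0).numpy()    # frame feature = mean of face embeddings

    # Ensemble whose decision is a weighted sum of the individual classifiers' outputs
    # (soft voting); the classifiers and weights here are placeholders, not the tuned ones.
    ensemble = VotingClassifier(
        estimators=[
            ('rf', RandomForestClassifier(n_estimators=300)),
            ('svm', SVC(probability=True)),
            ('lr', LogisticRegression(max_iter=1000)),
        ],
        voting='soft',
        weights=[2.0, 1.0, 1.0],
    )

    # Usage: X_train holds frame descriptors, y_train the group-level emotion labels
    # (negative / neutral / positive in EmotiW 2017).
    # ensemble.fit(X_train, y_train)
    # x = frame_descriptor(Image.open('frame.jpg'))
    # label = ensemble.predict(x.reshape(1, -1)) if x is not None else None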


Notes

  1. https://github.com/LamUong/FacialExpressionRecognition

  2. https://github.com/alxndrtarasov/GrEmRec


Acknowledgements

The article was prepared within the framework of the Academic Fund Program at the National Research University Higher School of Economics (HSE) in 2017 (grant №17-05-0007) and by the Russian Academic Excellence Project “5-100”.

Author information

Corresponding author

Correspondence to Alexander V. Tarasov.


Copyright information

© 2018 Springer Nature Switzerland AG

About this paper


Cite this paper

Tarasov, A.V., Savchenko, A.V. (2018). Emotion Recognition of a Group of People in Video Analytics Using Deep Off-the-Shelf Image Embeddings. In: van der Aalst, W., et al. Analysis of Images, Social Networks and Texts. AIST 2018. Lecture Notes in Computer Science, vol 11179. Springer, Cham. https://doi.org/10.1007/978-3-030-11027-7_19


  • DOI: https://doi.org/10.1007/978-3-030-11027-7_19

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-11026-0

  • Online ISBN: 978-3-030-11027-7

  • eBook Packages: Computer Science, Computer Science (R0)
