Abstract
Emotion recognition from videos or images requires a large amount of data to achieve high performance and classification accuracy. However, large datasets are not always readily available. A good solution to this problem is to augment the existing data to create a larger dataset for training the classifier. In this paper, we evaluate the impact of different geometric data augmentation (GDA) techniques on emotion recognition accuracy using facial image data. The GDA techniques implemented were horizontal reflection, cropping, and rotation, applied both separately and in combination. In addition, the system was evaluated with four different classifiers (Convolutional Neural Network (CNN), Linear Discriminant Analysis (LDA), K-Nearest Neighbor (kNN), and Decision Tree (DT)) to determine which of the four achieves the best results. In the proposed system, we trained on augmented data derived from the SAVEE dataset and tested on the original data. A combination of GDA techniques with the CNN classifier gave the best performance, approximately 97.8%. Our system with GDA augmentation was shown to outperform previous approaches in which only the original dataset was used for classifier training.
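The three GDA techniques named above can be sketched on a single grayscale face image. This is a minimal illustration, not the paper's implementation: the crop margin and the 90-degree rotation step are illustrative assumptions (the abstract does not state the exact parameters), and resizing the cropped patch back to the input resolution is omitted.

```python
import numpy as np

def augment(image: np.ndarray) -> list[np.ndarray]:
    """Return augmented copies of one H x W grayscale face image
    using the three GDA techniques: reflection, cropping, rotation,
    plus one combined variant."""
    h, w = image.shape
    augmented = []
    # 1. Horizontal reflection: mirror across the vertical axis.
    augmented.append(np.fliplr(image))
    # 2. Cropping: cut a 10% border on each side (margin is an
    #    illustrative choice; resizing back to H x W is omitted).
    m_h, m_w = h // 10, w // 10
    augmented.append(image[m_h:h - m_h, m_w:w - m_w])
    # 3. Rotation: 90 degrees here as a simple stand-in for the
    #    paper's (unspecified) rotation angles.
    augmented.append(np.rot90(image))
    # 4. Combined: reflection followed by rotation on the same image.
    augmented.append(np.rot90(np.fliplr(image)))
    return augmented
```

Applied to every training image, this quadruples the effective dataset size while leaving the held-out test set untouched, matching the train-on-augmented, test-on-original protocol described in the abstract.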
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Shoumy, N.J., Ang, L.M., Rahaman, D.M.M., Zia, T., Seng, K.P., Khatun, S. (2021). Improving Human Emotion Recognition from Emotive Videos Using Geometric Data Augmentation. In: Fujita, H., Selamat, A., Lin, J.C.W., Ali, M. (eds) Advances and Trends in Artificial Intelligence. From Theory to Practice. IEA/AIE 2021. Lecture Notes in Computer Science, vol. 12799. Springer, Cham. https://doi.org/10.1007/978-3-030-79463-7_13
DOI: https://doi.org/10.1007/978-3-030-79463-7_13
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-79462-0
Online ISBN: 978-3-030-79463-7
eBook Packages: Computer Science, Computer Science (R0)