A Machine Learning Approach to Detect Violent Behaviour from Video

Nova, David; Ferreira, André; Cortez, Paulo

doi:10.1007/978-3-030-16447-8_9

David Nova, André Ferreira & Paulo Cortez (ORCID: orcid.org/0000-0002-7991-2090)

Part of the book series: Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering (LNICST, volume 273)

Included in the following conference series:

International Conference on Intelligent Technologies for Interactive Entertainment

Abstract

The automatic classification of violent actions performed by two or more persons is an important task for both societal and scientific purposes. In this paper, we propose a machine learning approach, based on a Support Vector Machine (SVM), to detect whether a human action captured on video is violent or not. Using a pose estimation algorithm, we focus mostly on feature engineering to generate the SVM inputs. In particular, we hand-engineered a set of input features based on keypoints (angles, velocity and contact detection) and used them, under distinct combinations, to study their effect on violent behavior recognition from video. Overall, an excellent classification was achieved by the best performing SVM model, which used keypoints, angles and contact features computed over a 60-frame image input range.
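The abstract names three families of keypoint-derived features (angles, velocity and contact detection) without giving their definitions. The following is a minimal sketch of how such features might be computed from 2D pose keypoints, not the authors' implementation. The function names, the choice of joints, the frame rate and the pixel contact threshold are all assumptions made for illustration.

```python
import math

def joint_angle(a, b, c):
    """Angle in degrees at keypoint b between segments b->a and b->c."""
    v1 = (a[0] - b[0], a[1] - b[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    n1, n2 = math.hypot(*v1), math.hypot(*v2)
    if n1 == 0 or n2 == 0:
        return 0.0  # degenerate case: two keypoints coincide
    cos_t = (v1[0] * v2[0] + v1[1] * v2[1]) / (n1 * n2)
    cos_t = max(-1.0, min(1.0, cos_t))  # guard against rounding error
    return math.degrees(math.acos(cos_t))

def keypoint_velocity(prev, curr, fps=30.0):
    """Speed of one keypoint in pixels per second (fps is an assumed value)."""
    return math.hypot(curr[0] - prev[0], curr[1] - prev[1]) * fps

def in_contact(kp_a, kp_b, threshold=20.0):
    """Hypothetical contact test: keypoints closer than `threshold` pixels."""
    return math.hypot(kp_a[0] - kp_b[0], kp_a[1] - kp_b[1]) <= threshold

def window_features(person_a, person_b, fps=30.0, threshold=20.0):
    """Flatten per-frame features over a window of frames into one vector.

    Each person is a list of frames; each frame maps the (assumed) joint
    names 'shoulder', 'elbow' and 'wrist' to (x, y) pixel coordinates.
    """
    features = []
    for t, (fa, fb) in enumerate(zip(person_a, person_b)):
        # Elbow angle of person A.
        features.append(joint_angle(fa["shoulder"], fa["elbow"], fa["wrist"]))
        # Wrist speed of person A (zero for the first frame of the window).
        if t == 0:
            features.append(0.0)
        else:
            features.append(
                keypoint_velocity(person_a[t - 1]["wrist"], fa["wrist"], fps)
            )
        # Contact flag: wrist of A near the shoulder of B.
        features.append(1.0 if in_contact(fa["wrist"], fb["shoulder"], threshold) else 0.0)
    return features
```

Under these assumptions, a 60-frame window yields a vector of 60 × 3 values, which could then be passed to any SVM implementation as one training or test instance.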

Acknowledgments

The work of P. Cortez was supported by Fundação para a Ciência e Tecnologia (FCT) within the Project Scope: UID/CEC/00319/2013.

Author information

Authors and Affiliations

ALGORITMI Centre, Department of Information Systems, University of Minho, 4804-533, Guimarães, Portugal
David Nova & Paulo Cortez
Department of Informatics, University of Minho, 4710-057, Braga, Portugal
André Ferreira

Corresponding author

Correspondence to Paulo Cortez.

Editor information

Editors and Affiliations

Department of Information Systems, University of Minho, Guimarães, Portugal
Paulo Cortez
Department of Information Systems, University of Minho, Guimarães, Portugal
Luís Magalhães
University of Minho, Guimarães, Portugal
Pedro Branco
Department of Information Systems, University of Minho, Guimarães, Portugal
Carlos Filipe Portela
Department of Engineering, University of Trás-os-Montes e Alto Douro, Vila Real, Portugal
Telmo Adão

About this paper

Cite this paper

Nova, D., Ferreira, A., Cortez, P. (2019). A Machine Learning Approach to Detect Violent Behaviour from Video. In: Cortez, P., Magalhães, L., Branco, P., Portela, C., Adão, T. (eds) Intelligent Technologies for Interactive Entertainment. INTETAIN 2018. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 273. Springer, Cham. https://doi.org/10.1007/978-3-030-16447-8_9

DOI: https://doi.org/10.1007/978-3-030-16447-8_9
Published: 31 March 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-16446-1
Online ISBN: 978-3-030-16447-8
eBook Packages: Computer Science, Computer Science (R0)