Anomalous activity recognition deals with identifying patterns and events that deviate from the normal stream. In a surveillance setting, these events range from abuse and fighting to road accidents and snatching. Because anomalous events occur sparsely, recognizing anomalous activity in surveillance videos is a challenging research task. Reported approaches can be broadly categorized as handcrafted or deep learning-based. Most reported studies address binary classification, i.e., anomaly detection in surveillance videos, and do not distinguish among specific anomalous events such as abuse, fighting, road accidents, shooting, stealing, vandalism, and robbery. Therefore, this paper aims to provide an effective framework for recognizing different real-world anomalies in videos. The study presents a simple yet effective approach for learning spatiotemporal features using deep 3-dimensional convolutional networks (3D ConvNets) trained on the University of Central Florida (UCF) Crime video dataset. First, frame-level labels are provided for the UCF Crime dataset; then a fine-tuned 3D ConvNet is proposed to extract anomalous spatiotemporal features more efficiently. The findings are twofold: (1) the UCF Crime video feed contains specific, detectable, and quantifiable features that are associated with each other; and (2) multiclass learning can improve the generalization ability of 3D ConvNets by effectively learning frame-level information from the dataset, and results improve further when spatial augmentation is applied. The proposed study extracts 3D features by supplying frame-level information and spatial augmentation to a fine-tuned, pre-trained 3D ConvNet model. In addition, the learned features are compact, and the proposed approach significantly outperforms state-of-the-art approaches on anomalous activity recognition, achieving an AUC of 82%.
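For readers unfamiliar with the reported metric, the following is a minimal sketch of how an area under the ROC curve (AUC) can be computed for a binary anomalous-versus-normal split. It uses the standard pairwise-ranking identity: AUC is the fraction of (anomalous, normal) pairs in which the anomalous sample receives the higher score, with ties counted as half. The function name `roc_auc` and the toy labels and scores are illustrative assumptions; this section does not state how the 82% figure was computed.

```python
def roc_auc(labels, scores):
    """AUC via the pairwise-ranking (Mann-Whitney U) identity.

    labels: iterable of 0 (normal) or 1 (anomalous); hypothetical toy input.
    scores: iterable of predicted anomaly scores, higher means more anomalous.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # anomalous sample ranked above normal sample
            elif p == n:
                wins += 0.5   # ties count as half a correct ranking
    return wins / (len(pos) * len(neg))


# Toy example (made-up values, not taken from the paper).
toy_labels = [0, 0, 1, 1, 0, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8, 0.2, 0.7]
print(round(roc_auc(toy_labels, toy_scores), 3))  # prints 0.889
```

The quadratic pairwise loop is chosen for readability; a sort-based rank-sum gives the same value in O(n log n) for larger evaluation sets.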
No availability of data
We acknowledge partial support from the National Center of Big Data and Cloud Computing (NCBC) and HEC of Pakistan for conducting this research.
Conflicts of interest/competing interests
No conflict of interest
Cite this article
Maqsood, R., Bajwa, U.I., Saleem, G. et al. Anomaly recognition from surveillance videos using 3D convolution neural network. Multimed Tools Appl (2021). https://doi.org/10.1007/s11042-021-10570-3
- Anomalous activity recognition
- Spatial augmentation
- Spatial annotation