Anomaly recognition from surveillance videos using 3D convolution neural network

Abstract

Anomalous activity recognition deals with identifying patterns and events that deviate from the normal stream. In a surveillance setting, such events range from abuse and fighting to road accidents and snatching. Because anomalous events occur sparsely, recognizing them in surveillance videos is a challenging research task. Reported approaches can be broadly categorized as handcrafted or deep learning-based. Most reported studies address binary classification, i.e., anomaly detection in surveillance videos, and do not distinguish among specific anomalous events such as abuse, fighting, road accidents, shooting, stealing, vandalism, and robbery. This paper therefore aims to provide an effective framework for recognizing different real-world anomalies in videos. The study presents a simple yet effective approach for learning spatiotemporal features using deep 3-dimensional convolutional networks (3D ConvNets) trained on the University of Central Florida (UCF) Crime video dataset. First, frame-level labels are provided for the UCF Crime dataset; then, to extract anomalous spatiotemporal features more efficiently, a fine-tuned 3D ConvNet is proposed. The findings are twofold: (1) there exist specific, detectable, and quantifiable features in the UCF Crime video feed that are associated with each other; and (2) multiclass learning can improve the generalization ability of 3D ConvNets by effectively learning frame-level information from the dataset, and results improve further when spatial augmentation is applied. The proposed study extracts 3D features by providing frame-level information and spatial augmentation to a fine-tuned, pre-trained 3D ConvNet. In addition, the learned features are compact, and the proposed approach significantly outperforms state-of-the-art approaches on anomalous activity recognition, achieving an AUC of 82%.
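To illustrate what "spatiotemporal features" means for a 3D convolution, the following minimal Python sketch slides a single 3x3x3 kernel over a short grayscale clip, so that each output value depends on a neighbourhood in both space and time. It is not the network used in the paper: the function names, the hand-set temporal-difference kernel, and the toy clips are assumptions made only for illustration, whereas a real 3D ConvNet learns many such kernels across several layers.

```python
from typing import List

# A single-channel clip indexed as clip[t][y][x] (time, height, width).
Clip = List[List[List[float]]]


def conv3d_valid(clip: Clip, kernel: Clip) -> Clip:
    """3D cross-correlation with stride 1 and no padding ("valid" mode)."""
    T, H, W = len(clip), len(clip[0]), len(clip[0][0])
    kt, kh, kw = len(kernel), len(kernel[0]), len(kernel[0][0])
    out: Clip = []
    for t in range(T - kt + 1):
        plane = []
        for y in range(H - kh + 1):
            row = []
            for x in range(W - kw + 1):
                acc = 0.0
                # The kernel spans kt consecutive frames as well as a
                # kh x kw spatial window, so motion affects the response.
                for dt in range(kt):
                    for dy in range(kh):
                        for dx in range(kw):
                            acc += clip[t + dt][y + dy][x + dx] * kernel[dt][dy][dx]
                row.append(acc)
            plane.append(row)
        out.append(plane)
    return out


def temporal_difference_kernel() -> Clip:
    """Hypothetical hand-set 3x3x3 kernel: last frame minus first frame,
    summed over a 3x3 spatial window (a learned kernel would differ)."""
    first = [[-1.0] * 3 for _ in range(3)]
    middle = [[0.0] * 3 for _ in range(3)]
    last = [[1.0] * 3 for _ in range(3)]
    return [first, middle, last]


# Toy clips of 4 frames, each 5x5 pixels (illustrative data, not UCF Crime).
static_clip = [[[1.0] * 5 for _ in range(5)] for _ in range(4)]
brightening_clip = [[[float(t)] * 5 for _ in range(5)] for t in range(4)]

static_response = conv3d_valid(static_clip, temporal_difference_kernel())
moving_response = conv3d_valid(brightening_clip, temporal_difference_kernel())
```

With these toy inputs, the unchanging clip yields a zero response everywhere, while the clip whose brightness changes from frame to frame yields a non-zero response. A purely 2D (per-frame) convolution could not make this distinction, because it never compares pixels across frames.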

Data availability

No data are available.

Acknowledgments

We acknowledge partial support from the National Center of Big Data and Cloud Computing (NCBC) and HEC of Pakistan for conducting this research.

Author information

Corresponding author

Correspondence to Usama Ijaz Bajwa.

Ethics declarations

Conflicts of interest/competing interests

No conflict of interest

About this article

Cite this article

Maqsood, R., Bajwa, U.I., Saleem, G. et al. Anomaly recognition from surveillance videos using 3D convolution neural network. Multimed Tools Appl (2021). https://doi.org/10.1007/s11042-021-10570-3

Keywords

  • Anomalous activity recognition
  • 3DConvNets
  • Spatial augmentation
  • Spatial annotation