A Optical Flow-Based Fight Behavior Detection Method for Campus Scene

Yang, Shu; Li, Yali; Wang, Shengjin

doi:10.1007/978-981-99-7549-5_14

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1910)

Included in the following conference series:

Chinese Conference on Image and Graphics Technologies

Abstract

Campuses contain many facilities, all of which must be monitored to ensure security. However, most existing video surveillance systems rely on human operators to watch the footage, so they cannot provide automatic early warning of dangerous situations. This paper proposes a video-based action detection method for student fights, which occur frequently on campus. The method uses an optical flow algorithm to coarsely localize the regions where a fight may be occurring, then uses a transformer network to classify the action in each region of interest. In addition, this paper builds a fight recognition dataset for middle school campuses, used for model training, validation and testing. The experimental results show that the proposed method can locate fight actions relatively accurately and provide real-time early warning.
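The coarse localization step can be illustrated with a minimal sketch. The abstract does not specify how regions are derived from the optical flow, so everything here is an assumption for illustration: the function name `coarse_motion_box`, the `min_magnitude` threshold, and the choice of a single bounding box around all fast-moving pixels are hypothetical. The flow field is a hand-built toy grid rather than the output of a real optical flow algorithm.

```python
from typing import List, Optional, Tuple

Flow = List[List[Tuple[float, float]]]  # per-pixel (dx, dy) displacement
Box = Tuple[int, int, int, int]         # (x_min, y_min, x_max, y_max), inclusive


def coarse_motion_box(flow: Flow, min_magnitude: float = 1.0) -> Optional[Box]:
    """Bounding box of all pixels whose flow magnitude reaches the threshold.

    Hypothetical helper: an assumed stand-in for the coarse positioning
    step, not the method described in the paper.
    """
    xs: List[int] = []
    ys: List[int] = []
    for y, row in enumerate(flow):
        for x, (dx, dy) in enumerate(row):
            if (dx * dx + dy * dy) ** 0.5 >= min_magnitude:
                xs.append(x)
                ys.append(y)
    if not xs:
        return None  # no significant motion in this frame pair
    return (min(xs), min(ys), max(xs), max(ys))


def make_toy_flow() -> Flow:
    """A 4x6 static scene with one 2x3 patch of strong motion."""
    flow: Flow = [[(0.0, 0.0) for _ in range(6)] for _ in range(4)]
    for y in (1, 2):
        for x in (2, 3, 4):
            flow[y][x] = (3.0, 4.0)  # displacement with magnitude 5
    return flow


toy_flow = make_toy_flow()
roi = coarse_motion_box(toy_flow)
print(roi)
```

In a pipeline of the kind the abstract outlines, a box like `roi` would be cropped from the frames and passed to the classifier, so the more expensive network only runs on regions that show motion.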

Supported by China Postdoctoral Science Foundation (Grant No. 2022M721893).

Author information

Authors and Affiliations

Department of Electronic Engineering, Tsinghua University, Beijing, 100084, China
Shu Yang, Yali Li & Shengjin Wang

Corresponding author

Correspondence to Shu Yang.

Editor information

Editors and Affiliations

Beijing Institute of Technology, Beijing, China
Wang Yongtian
Beijing University of Technology, Beijing, China
Wu Lifang

Cite this paper

Yang, S., Li, Y., Wang, S. (2023). A Optical Flow-Based Fight Behavior Detection Method for Campus Scene. In: Yongtian, W., Lifang, W. (eds) Image and Graphics Technologies and Applications. IGTA 2023. Communications in Computer and Information Science, vol 1910. Springer, Singapore. https://doi.org/10.1007/978-981-99-7549-5_14

DOI: https://doi.org/10.1007/978-981-99-7549-5_14
Published: 25 October 2023
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-7548-8
Online ISBN: 978-981-99-7549-5
eBook Packages: Computer Science, Computer Science (R0)
