Soccer Video Event Detection Using 3D Convolutional Networks and Shot Boundary Detection via Deep Feature Distance

Liu, Tingxi; Lu, Yao; Lei, Xiaoyu; Zhang, Lijing; Wang, Haoyu; Huang, Wei; Wang, Zijian

doi:10.1007/978-3-319-70096-0_46

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 10635)

Included in the following conference series:

International Conference on Neural Information Processing

Abstract

In this work, we propose a novel framework that combines temporal action localization with play-break (PB) rules for soccer video event detection. First, we treat event detection at the action level and adopt 3D convolutional networks to perform action localization. Then, we employ PB rules to organize actions into events, using the long views and replay logos detected in the first step. Finally, we determine the semantic class of each event from its principal actions, which carry the key semantic information of highlights. For long untrimmed videos, we propose a shot boundary detection method based on deep feature distance (DFD) to reduce the number of proposals and improve localization performance. Experimental results verify the effectiveness of our framework on a new dataset that contains 152 classes of semantic actions and scenes in soccer video.
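
The idea of shot boundary detection via deep feature distance can be illustrated with a minimal sketch. The abstract does not specify the distance measure, the threshold, or the network layer used, so everything below is an assumption for illustration: per-frame deep features are taken as already extracted and given as plain lists of floats, cosine distance stands in for the DFD, and the function names and the threshold of 0.5 are hypothetical. It is not the authors' implementation.

```python
import math
from typing import List, Sequence, Tuple


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Return 1 - cosine similarity (assumed stand-in for the feature distance)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 1.0  # treat an all-zero feature as maximally distant
    return 1.0 - dot / (norm_a * norm_b)


def detect_shot_boundaries(
    features: Sequence[Sequence[float]], threshold: float = 0.5
) -> List[int]:
    """Return frame indices i where a new shot is assumed to start.

    A boundary is declared when the distance between the features of
    frame i-1 and frame i exceeds the (hypothetical) threshold.
    """
    boundaries = []
    for i in range(1, len(features)):
        if cosine_distance(features[i - 1], features[i]) > threshold:
            boundaries.append(i)
    return boundaries


def split_into_shots(num_frames: int, boundaries: Sequence[int]) -> List[Tuple[int, int]]:
    """Convert boundary indices into half-open (start, end) frame ranges."""
    starts = [0] + list(boundaries)
    ends = list(boundaries) + [num_frames]
    return [(s, e) for s, e in zip(starts, ends) if e > s]


# Toy features: two frames pointing one way, then two pointing another.
toy_features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
toy_boundaries = detect_shot_boundaries(toy_features)
toy_shots = split_into_shots(len(toy_features), toy_boundaries)
```

Under these assumptions, action proposals would only need to be generated within each resulting shot rather than across the whole untrimmed video, which is one plausible way segmentation could reduce the number of proposals.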

Acknowledgments

This work was supported by the National Natural Science Foundation of China (No. 61273273).

Author information

Authors and Affiliations

Beijing Laboratory of Intelligent Information Technology, School of Computer Science, Beijing Institute of Technology, Beijing, 100081, China
Tingxi Liu, Yao Lu, Xiaoyu Lei, Lijing Zhang, Haoyu Wang, Wei Huang & Zijian Wang
China Central Television, Beijing, China
Zijian Wang

Corresponding author

Correspondence to Yao Lu.

Editor information

Editors and Affiliations

Guangdong University of Technology, Guangzhou, China
Derong Liu
Guangdong University of Technology, Guangzhou, China
Shengli Xie
South China University of Technology, Guangzhou, China
Yuanqing Li
Institute of Automation, Chinese Academy of Sciences, Beijing, China
Dongbin Zhao
King Fahd University of Petroleum and Minerals, Dhahran, Saudi Arabia
El-Sayed M. El-Alfy

Cite this paper

Liu, T. et al. (2017). Soccer Video Event Detection Using 3D Convolutional Networks and Shot Boundary Detection via Deep Feature Distance. In: Liu, D., Xie, S., Li, Y., Zhao, D., El-Alfy, ES. (eds) Neural Information Processing. ICONIP 2017. Lecture Notes in Computer Science, vol 10635. Springer, Cham. https://doi.org/10.1007/978-3-319-70096-0_46

DOI: https://doi.org/10.1007/978-3-319-70096-0_46
Published: 26 October 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-70095-3
Online ISBN: 978-3-319-70096-0
eBook Packages: Computer Science, Computer Science (R0)
