Abstract
We propose a robust and efficient method for accurately detecting and localizing complex human actions in video, in both space and time, using spatio-temporal templates. A simple but effective motion descriptor based on the motion-compensated frame difference is designed for template representation; it is resistant to posture deformation and to cluttered, moving backgrounds. A multi-step filtering scheme speeds up the localization of target candidates and their matching to the templates. To register the template sequence to the video, we present an extended continuous dynamic programming technique that computes the matching scores for multiple trajectories simultaneously. Extensive experiments on different videos demonstrate the effectiveness of the proposed method.
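To make the template-to-video registration step more concrete, the sketch below shows a simplified, single-trajectory form of subsequence alignment in the spirit of continuous dynamic programming. It is not the extended multi-trajectory technique proposed in the paper: the function name `spot_template`, the three-way step pattern, and the normalization by template length are all assumptions chosen for illustration, and the local distances are plain absolute differences between toy 1-D features rather than the paper's motion descriptor.

```python
import numpy as np

def spot_template(dist):
    """Score a template ending at every video frame (illustrative sketch).

    dist[t, k] is the local distance between video frame t and
    template frame k. Returns one score per video frame: the
    accumulated cost of the best alignment of the whole template
    that ends at that frame, divided by the template length
    (an assumed, simplified normalization).
    """
    n_frames, n_tmpl = dist.shape
    acc = np.full((n_frames, n_tmpl), np.inf)
    for t in range(n_frames):
        # Free starting point: a match may begin at any video frame.
        acc[t, 0] = dist[t, 0]
        for k in range(1, n_tmpl):
            # Predecessors: same video frame, or the previous video frame
            # at the same or the previous template frame.
            prev = acc[t, k - 1]
            if t > 0:
                prev = min(prev, acc[t - 1, k], acc[t - 1, k - 1])
            acc[t, k] = dist[t, k] + prev
    return acc[:, -1] / n_tmpl

# Toy example: 1-D "features" stand in for per-frame motion descriptors.
template = np.array([0.0, 1.0, 2.0])
video = np.array([5.0, 5.0, 0.0, 1.0, 2.0, 5.0])
dist = np.abs(video[:, None] - template[None, :])
scores = spot_template(dist)
best_end = int(np.argmin(scores))  # video frame where the best match ends
```

In this toy input the template occurs exactly at video frames 2 to 4, so the lowest score falls at frame 4. Because the first template frame can start anywhere, the scores can be updated frame by frame as video arrives, which is what makes this family of methods suited to spotting actions in a continuous stream.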
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Li, H., Sun, F., Guan, Y. (2013). Robust Detection and Localization of Human Action in Video. In: Li, S., et al. Advances in Multimedia Modeling. Lecture Notes in Computer Science, vol 7733. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-35728-2_25
DOI: https://doi.org/10.1007/978-3-642-35728-2_25
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-35727-5
Online ISBN: 978-3-642-35728-2
eBook Packages: Computer Science, Computer Science (R0)