Human Interaction Recognition by Spatial Structure Models

Wu, Jianzhai; Chen, Fanglin; Hu, Dewen

doi:10.1007/978-3-642-42057-3_28

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 8261)

Included in the following conference series:

International Conference on Intelligent Science and Big Data Engineering

Abstract

In this paper, we focus on the recognition and localization of human interactions in real-world videos. This is a difficult problem because of large intra-class variability and variations in person appearance, camera viewpoint, and video length. To address these challenges, we present a spatial structure model. In our model, the crucial movement of each category is represented using a segment of the entire video. To capture the spatial configuration of the human interaction within the video segment, a spatial structure model is built over the segment, and trajectory features are extracted within each of its cells. The proposed model is trained automatically from real-world videos that are annotated only with the class label. We evaluate our approach on the TVHI dataset, which contains 4 complex human interaction classes. The experimental results demonstrate the effectiveness of our model.
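
The idea of a spatial grid laid over a video segment, with trajectory features pooled per cell, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the grid size, the feature encoding, or how the segment is chosen. The sketch assumes trajectories have already been quantised into codeword indices, that coordinates are normalised to [0, 1), that each cell is described by an L1-normalised codeword histogram, and that the segment is picked by sliding a fixed-length window and keeping the highest linear score. The names `cell_histograms`, `best_segment`, and all parameters are hypothetical.

```python
from typing import List, Sequence, Tuple

# (x, y, t, word): x and y are normalised to [0, 1), t is a frame index,
# and word is the index of a quantised trajectory descriptor (assumed).
Trajectory = Tuple[float, float, int, int]


def cell_histograms(trajs: Sequence[Trajectory], t_start: int, t_end: int,
                    rows: int, cols: int, vocab: int) -> List[float]:
    """Concatenate one L1-normalised codeword histogram per grid cell."""
    hists = [[0.0] * vocab for _ in range(rows * cols)]
    for x, y, t, word in trajs:
        if not (t_start <= t < t_end):
            continue  # trajectory lies outside the candidate segment
        r = min(int(y * rows), rows - 1)  # grid row of the trajectory
        c = min(int(x * cols), cols - 1)  # grid column of the trajectory
        hists[r * cols + c][word] += 1.0
    feature: List[float] = []
    for h in hists:
        total = sum(h)
        feature.extend(v / total if total > 0 else 0.0 for v in h)
    return feature


def best_segment(trajs: Sequence[Trajectory], n_frames: int, seg_len: int,
                 stride: int, weights: Sequence[float],
                 rows: int, cols: int, vocab: int) -> Tuple[float, int]:
    """Slide a fixed-length window; return (best score, its start frame)."""
    best = (float("-inf"), 0)
    last_start = max(n_frames - seg_len, 0)
    for start in range(0, last_start + 1, stride):
        f = cell_histograms(trajs, start, start + seg_len, rows, cols, vocab)
        score = sum(w * v for w, v in zip(weights, f))  # assumed linear scorer
        if score > best[0]:
            best = (score, start)
    return best
```

Treating the segment position as something to be searched over, rather than annotated, is one way to reconcile the abstract's two statements that a single segment carries the crucial movement and that training videos carry only a class label. The weights here are taken as given; how they would be learned is outside what the abstract describes.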

Author information

Authors and Affiliations

College of Mechatronic Engineering and Automation, National University of Defense Technology, Changsha, Hunan, P.R. China, 410073
Jianzhai Wu, Fanglin Chen & Dewen Hu

Editor information

Editors and Affiliations

School of Automation and Electrical Engineering, University of Science and Technology, Xueyuan Road No. 30, 100083, Beijing, China
Changyin Sun
Department of Psychology, Peking University, Yiheyuan Road No. 5, 100871, Beijing, China
Fang Fang
Department of Computer Science and Technology, Nanjing University, Xianlin Avenue No. 163, 210023, Nanjing, China
Zhi-Hua Zhou
School of Automation, Southeast University, Sipailou No. 2, 210096, Nanjing, China
Wankou Yang
Institute of Automation, Chinese Academy of Sciences, No. 95 East Zhongguancun Road, 100190, Beijing, China
Zhi-Yong Liu

About this paper

Cite this paper

Wu, J., Chen, F., Hu, D. (2013). Human Interaction Recognition by Spatial Structure Models. In: Sun, C., Fang, F., Zhou, ZH., Yang, W., Liu, ZY. (eds) Intelligence Science and Big Data Engineering. IScIDE 2013. Lecture Notes in Computer Science, vol 8261. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-42057-3_28

DOI: https://doi.org/10.1007/978-3-642-42057-3_28
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-42056-6
Online ISBN: 978-3-642-42057-3
eBook Packages: Computer Science; Computer Science (R0)
