Combining Skeletal Pose with Local Motion for Human Activity Recognition

Xu, Ran; Agarwal, Priyanshu; Kumar, Suren; Krovi, Venkat N.; Corso, Jason J.

doi:10.1007/978-3-642-31567-1_11

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 7378)

Included in the following conference series:

International Conference on Articulated Motion and Deformable Objects

Abstract

Recent work in human activity recognition has focused on bottom-up approaches that rely on spatiotemporal features, both dense and sparse. In contrast, articulated motion, which naturally incorporates explicit human action information, has not been heavily studied, likely because of the inherent challenge of modeling and inferring articulated human motion from video. However, recent developments in data-driven human pose estimation have made such modeling plausible. In this paper, we extend these developments with a new middle-level representation called dynamic pose, which couples local motion information directly and independently with human skeletal pose, and we present an appropriate distance function on dynamic poses. We demonstrate the representative power of dynamic pose over raw skeletal pose in an activity recognition setting, using simple codebook matching and support vector machines as the classifier. Our results conclusively demonstrate that dynamic pose is a more powerful representation of human action than skeletal pose.
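
To make the pipeline in the abstract concrete, the following is a minimal sketch of codebook matching over frames that pair joint positions with a per-joint local motion descriptor. It is not the authors' implementation: the abstract does not specify the distance function, so the weighted sum of Euclidean terms, the `weight` parameter, and the names `dynamic_pose_distance` and `codebook_histogram` are all assumptions made for illustration.

```python
import numpy as np


def dynamic_pose_distance(a, b, weight=0.5):
    """Hypothetical distance between two dynamic poses.

    Each dynamic pose is a tuple (pose, motion):
      pose   -- array of shape (num_joints, 2), joint image coordinates
      motion -- array of shape (num_joints, d), one local motion
                descriptor per joint
    The form below (sum of per-joint Euclidean distances, with the motion
    term scaled by `weight`) is an assumption, not the paper's definition.
    """
    pose_a, motion_a = a
    pose_b, motion_b = b
    pose_term = np.linalg.norm(pose_a - pose_b, axis=1).sum()
    motion_term = np.linalg.norm(motion_a - motion_b, axis=1).sum()
    return pose_term + weight * motion_term


def codebook_histogram(frames, codebook, weight=0.5):
    """Assign each frame to its nearest codeword; return a normalized histogram."""
    hist = np.zeros(len(codebook))
    for frame in frames:
        dists = [dynamic_pose_distance(frame, word, weight) for word in codebook]
        hist[int(np.argmin(dists))] += 1
    return hist / max(hist.sum(), 1.0)


# Toy data: 3 joints, 4-dimensional motion descriptors, 2 codewords.
word0 = (np.zeros((3, 2)), np.zeros((3, 4)))
word1 = (np.ones((3, 2)), np.ones((3, 4)))
frames = [
    (np.full((3, 2), 0.1), np.full((3, 4), 0.1)),  # closest to word0
    (np.full((3, 2), 0.9), np.full((3, 4), 0.9)),  # closest to word1
    (np.full((3, 2), 0.8), np.full((3, 4), 0.8)),  # closest to word1
]
hist = codebook_histogram(frames, [word0, word1])
```

In a setup like this, each video would yield one histogram, and those histograms would serve as the feature vectors passed to a support vector machine. Setting `weight=0` reduces the distance to raw skeletal pose alone, which gives the baseline the abstract compares against.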

Author information

Authors and Affiliations

Computer Science and Engineering, State University of New York at Buffalo, NY, USA
Ran Xu & Jason J. Corso
Mechanical and Aerospace Engineering, State University of New York at Buffalo, NY, USA
Priyanshu Agarwal, Suren Kumar & Venkat N. Krovi

Editor information

Editors and Affiliations

Dept. of Computer Science and Mathematics, UIB – Universitat de les Illes Balears, C/ Valldemossa km 7.5, PC 07122, Palma de Mallorca, Spain
Francisco J. Perales
School of Informatics, University of Edinburgh, 1.26 Informatics Forum, 10 Crichton St., EH8 9AB, Edinburgh, UK
Robert B. Fisher
Dept. for Architecture, Design and Media Technology, Aalborg University, Niels Jernes Vej 14, 9220, Aalborg East, Denmark
Thomas B. Moeslund

Cite this paper

Xu, R., Agarwal, P., Kumar, S., Krovi, V.N., Corso, J.J. (2012). Combining Skeletal Pose with Local Motion for Human Activity Recognition. In: Perales, F.J., Fisher, R.B., Moeslund, T.B. (eds) Articulated Motion and Deformable Objects. AMDO 2012. Lecture Notes in Computer Science, vol 7378. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-31567-1_11

DOI: https://doi.org/10.1007/978-3-642-31567-1_11
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-31566-4
Online ISBN: 978-3-642-31567-1
eBook Packages: Computer Science, Computer Science (R0)
