Global Flow and Temporal-Shape Descriptors for Human Action Recognition from 3D Reconstruction Data

Papadopoulos, Georgios Th.; Daras, Petros

doi:10.1007/978-3-319-62416-7_4

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 10358)

Included in the following conference series:

International Conference on Machine Learning and Data Mining in Pattern Recognition

Abstract

In this paper, global-level view-invariant descriptors for human action recognition using 3D reconstruction data are proposed. 3D reconstruction techniques are employed to address two of the most challenging issues in general human action recognition, namely view-variance and the presence of (self-)occlusions. Initially, a set of calibrated Kinect sensors is employed to produce a 3D reconstruction of the performing subjects. Subsequently, a 3D flow field is estimated for every captured frame. For performing action recognition, a novel global 3D flow descriptor is introduced, which efficiently encodes the global motion characteristics in a compact way, while also incorporating information about their spatial distribution. Additionally, a new global temporal-shape descriptor is proposed, which extends the notion of 3D shape descriptions for action recognition by including temporal information. The latter descriptor efficiently addresses the inherent problems of temporal alignment and compact representation, while also being robust in the presence of noise. Experimental results using public datasets demonstrate the efficiency of the proposed approach.
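
The abstract does not give the construction of the global 3D flow descriptor, so the following is only a minimal sketch of the general idea it names: summarising a per-frame 3D flow field in one compact histogram that also keeps some spatial-distribution information. Everything here is an assumption made for illustration and not the authors' method: the function name `global_flow_histogram`, the choice of concentric shells around the centroid as spatial bins, the angle to a vertical z axis as the motion bin, and the bin counts.

```python
import math

def global_flow_histogram(points, flows, n_shells=3, n_angles=4):
    """Hypothetical global 3D flow histogram; not the descriptor from the paper.

    points: non-empty list of (x, y, z) reconstructed surface points for one frame.
    flows:  list of (dx, dy, dz) 3D flow vectors, one per point.
    Assumes the z axis is vertical.
    """
    n = len(points)
    centroid = [sum(p[k] for p in points) / n for k in range(3)]
    radii = [math.dist(p, centroid) for p in points]
    max_radius = max(radii) or 1.0  # avoid division by zero for a degenerate cloud

    hist = [0.0] * (n_shells * n_angles)
    for radius, f in zip(radii, flows):
        magnitude = math.sqrt(f[0] ** 2 + f[1] ** 2 + f[2] ** 2)
        if magnitude == 0.0:
            continue  # static points carry no motion information
        # Spatial bin: concentric shell around the centroid.
        shell = min(int(radius / max_radius * n_shells), n_shells - 1)
        # Motion bin: angle between the flow vector and the vertical (z) axis.
        angle = math.acos(max(-1.0, min(1.0, f[2] / magnitude)))
        angle_bin = min(int(angle / math.pi * n_angles), n_angles - 1)
        # Weight each vote by the flow magnitude.
        hist[shell * n_angles + angle_bin] += magnitude

    total = sum(hist)
    return [h / total for h in hist] if total > 0 else hist
```

Both binned quantities in this sketch (distance from the centroid and angle to the vertical axis) are unchanged when the whole scene is rotated about the vertical axis, which is one simple way to make a descriptor insensitive to where the sensors stand around the subject. The descriptor proposed in the paper may be built quite differently.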

Acknowledgment

The work presented in this paper was supported by the European Commission under contract H2020-700367 DANTE.

Author information

Authors and Affiliations

Information Technologies Institute, Centre for Research and Technology Hellas, 57001, Thermi, Thessaloniki, Greece
Georgios Th. Papadopoulos & Petros Daras

Corresponding author

Correspondence to Georgios Th. Papadopoulos.

Editor information

Editors and Affiliations

Institute of Computer Vision and Applied Computer Sciences, Leipzig, Sachsen, Germany
Petra Perner

Cite this paper

Papadopoulos, G.T., Daras, P. (2017). Global Flow and Temporal-Shape Descriptors for Human Action Recognition from 3D Reconstruction Data. In: Perner, P. (eds) Machine Learning and Data Mining in Pattern Recognition. MLDM 2017. Lecture Notes in Computer Science, vol 10358. Springer, Cham. https://doi.org/10.1007/978-3-319-62416-7_4

DOI: https://doi.org/10.1007/978-3-319-62416-7_4
Published: 02 July 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-62415-0
Online ISBN: 978-3-319-62416-7
eBook Packages: Computer Science, Computer Science (R0)
