3D Action Classification Using Sparse Spatio-temporal Feature Representations

Azary, Sherif; Savakis, Andreas

doi:10.1007/978-3-642-33191-6_17

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 7432)

Included in the following conference series:

International Symposium on Visual Computing

Abstract

Automatic action classification is challenging for many reasons, including unconstrained human motion, background clutter, and view dependencies. The introduction of affordable depth sensors creates opportunities to investigate new approaches to action classification that take advantage of depth information. In this paper, we classify actions in 3D video sequences using sparse representations of spatio-temporal kinematic joint descriptors, and compare the classification accuracy against that obtained with spatio-temporal raw depth data descriptors. These descriptors form over-complete dictionaries, which are used to classify test actions by L₁-norm minimization with a least squares loss and a regularization parameter. We find that the representations of raw depth features are naturally sparser than those of kinematic joint features, and that our approach is highly effective and efficient at classifying a wide variety of actions from the Microsoft Research 3D Dataset (MSR3D).
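
The classification step described in the abstract (an over-complete dictionary of training descriptors, and a test descriptor coded by least squares loss with L₁-norm regularization) can be illustrated with a minimal sketch. The abstract does not state the solver, the decision rule, or the value of the regularization parameter, so everything below is an assumption made for illustration: the solver is plain ISTA (iterative soft-thresholding), the label is chosen by the smallest class-wise reconstruction residual, and the data are synthetic toy vectors rather than MSR3D descriptors. The names `ista_lasso`, `classify`, `make_toy_dictionary`, and `lam` are hypothetical.

```python
import numpy as np


def soft_threshold(v, t):
    """Element-wise soft-thresholding, the proximal operator of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)


def ista_lasso(D, y, lam=0.05, n_iter=500):
    """Approximately minimize 0.5 * ||y - D x||_2^2 + lam * ||x||_1 with ISTA.

    D is a (dim, n_atoms) dictionary whose columns are training descriptors.
    The solver choice is an assumption; the abstract does not name one.
    """
    lipschitz = np.linalg.norm(D, 2) ** 2  # squared spectral norm of D
    x = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ x - y)  # gradient of the least squares term
        x = soft_threshold(x - grad / lipschitz, lam / lipschitz)
    return x


def classify(D, labels, y, lam=0.05):
    """Assign y to the class whose atoms reconstruct it with the smallest residual.

    The minimum-residual rule is an assumed decision rule for this sketch.
    """
    x = ista_lasso(D, y, lam)
    classes = np.unique(labels)
    residuals = [
        np.linalg.norm(y - D[:, labels == c] @ x[labels == c]) for c in classes
    ]
    return classes[int(np.argmin(residuals))], x


def make_toy_dictionary(dim=8, atoms_per_class=10, noise=0.05, seed=0):
    """Build a synthetic two-class dictionary with more atoms than dimensions."""
    rng = np.random.default_rng(seed)
    blocks, labels = [], []
    for c in range(2):
        proto = np.zeros(dim)
        proto[c] = 1.0  # class c clusters around the c-th coordinate axis
        atoms = proto[:, None] + noise * rng.standard_normal((dim, atoms_per_class))
        blocks.append(atoms / np.linalg.norm(atoms, axis=0))  # unit-norm columns
        labels += [c] * atoms_per_class
    return np.hstack(blocks), np.array(labels)


# Toy usage: a noisy sample drawn near the class-1 prototype.
D, labels = make_toy_dictionary()
rng = np.random.default_rng(1)
y = np.zeros(8)
y[1] = 1.0
y = y + 0.05 * rng.standard_normal(8)
y /= np.linalg.norm(y)

predicted, x = classify(D, labels, y)
print("predicted class:", predicted)
print("nonzero coefficients:", np.count_nonzero(x), "of", x.size)
```

In this sketch a larger `lam` drives more coefficients to zero at the cost of a larger reconstruction error, which is the trade-off the regularization parameter controls.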

Author information

Authors and Affiliations

Computing and Information Sciences and Computer Engineering, Rochester Institute of Technology, 14623, Rochester, NY, USA
Sherif Azary & Andreas Savakis

Editor information

Editors and Affiliations

Department of Computer Science, University of Nevada, 89557, Reno, NV, USA
George Bebis
NASA Ames Research Center, 94035, Moffett Field, CA, USA
Richard Boyle
Lawrence Berkeley National Laboratory, 94720, Berkeley, CA, USA
Bahram Parvin
Desert Research Institute, 89512, Reno, NV, USA
Darko Koracin
Department of Computer Science, University of California at Irvine, 92697-3435, Irvine, CA, USA
Charless Fowlkes
Eastman Kodak Company, 14650-2102, Rochester, NY, USA
Sen Wang
Department of Computer Science and Engineering, University of Colorado at Denver, 80217, Denver, CO, USA
Min-Hyung Choi
VRVis Zentrum für Virtual Reality and Visualisierung, 1220, Vienna, Austria
Stephan Mantler
California Institute for Telecommunications and Information Technology, University of California, San Diego, 92093, La Jolla, CA, USA
Jürgen Schulze
KAUST Visualization Core Lab., 23955-6900, Thuwal, Saudi Arabia
Daniel Acevedo
Stony Brook University, 11794-4400, NY, USA
Klaus Mueller
Argonne National Laboratory, 60439, IL, USA
Michael Papka

About this paper

Cite this paper

Azary, S., Savakis, A. (2012). 3D Action Classification Using Sparse Spatio-temporal Feature Representations. In: Bebis, G., et al. Advances in Visual Computing. ISVC 2012. Lecture Notes in Computer Science, vol 7432. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-33191-6_17

DOI: https://doi.org/10.1007/978-3-642-33191-6_17
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-33190-9
Online ISBN: 978-3-642-33191-6
eBook Packages: Computer Science, Computer Science (R0)
