Abstract
Automatic action classification is challenging for many reasons, including unconstrained human motion, background clutter, and view dependence. Affordable depth sensors open the way to new approaches to action classification that exploit depth information. In this paper, we classify actions in 3D video sequences using sparse representations of spatio-temporal kinematic joint descriptors, and we compare the resulting classification accuracy against that of spatio-temporal raw depth data descriptors. The descriptors are used to build over-complete dictionaries, and test actions are classified by least-squares minimization with an L1-norm penalty weighted by a regularization parameter. We find that representations of raw depth features are naturally sparser than those of kinematic joint features, and that our approach is highly effective and efficient at classifying a wide variety of actions from the Microsoft Research 3D Dataset (MSR3D).
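The classification step described above, coding a test descriptor against an over-complete dictionary by L1-regularized least squares, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the descriptors are synthetic random vectors, the solver is plain iterative soft-thresholding (ISTA), the names `ista_lasso` and `src_classify` and the value of `lam` are hypothetical, and the class decision uses the smallest class-wise reconstruction residual, a rule common in sparse-representation classification that may differ from the authors' exact procedure.

```python
import numpy as np

def soft_threshold(v, t):
    """Element-wise soft-thresholding, the proximal operator of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def ista_lasso(A, y, lam=0.05, n_iter=500):
    """Approximately minimize 0.5 * ||A x - y||_2^2 + lam * ||x||_1 with ISTA."""
    # Step size 1/L, where L is the squared spectral norm of A
    # (the Lipschitz constant of the gradient of the least-squares term).
    L = np.linalg.norm(A, 2) ** 2
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ x - y)
        x = soft_threshold(x - grad / L, lam / L)
    return x

def src_classify(A, labels, y, lam=0.05):
    """Assign y to the class whose dictionary atoms best reconstruct it."""
    x = ista_lasso(A, y, lam)
    classes = np.unique(labels)
    residuals = [
        np.linalg.norm(y - A[:, labels == c] @ x[labels == c]) for c in classes
    ]
    return classes[int(np.argmin(residuals))], x

# Synthetic stand-in for action descriptors (assumption: not MSR3D data).
rng = np.random.default_rng(0)
n_classes, per_class, dim = 3, 20, 30           # 60 atoms in 30 dimensions
centers = rng.normal(size=(n_classes, dim))     # one prototype per class
labels = np.repeat(np.arange(n_classes), per_class)

# Over-complete dictionary: one unit-norm column per training descriptor.
A = np.column_stack([centers[c] + 0.1 * rng.normal(size=dim) for c in labels])
A /= np.linalg.norm(A, axis=0)

# A test descriptor drawn near the class-1 prototype.
y = centers[1] + 0.1 * rng.normal(size=dim)
y /= np.linalg.norm(y)

pred, x = src_classify(A, labels, y)
print("predicted class:", pred)
print("nonzero coefficients:", np.count_nonzero(x), "of", x.size)
```

The regularization parameter trades reconstruction error against sparsity: a larger `lam` drives more coefficients to exactly zero, while a smaller one fits the test descriptor more closely with more atoms.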
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Azary, S., Savakis, A. (2012). 3D Action Classification Using Sparse Spatio-temporal Feature Representations. In: Bebis, G., et al. Advances in Visual Computing. ISVC 2012. Lecture Notes in Computer Science, vol 7432. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-33191-6_17
DOI: https://doi.org/10.1007/978-3-642-33191-6_17
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-33190-9
Online ISBN: 978-3-642-33191-6
eBook Packages: Computer Science, Computer Science (R0)