Abstract
The goal of this paper is to develop a real-time system for one-shot learning and recognition of 3D actions. We use RGBD images, combine motion and appearance cues, and map them into a new overcomplete space. The proposed method relies on descriptors based on the 3D Histogram of Flow (3DHOF) and the Global Histogram of Oriented Gradient (GHOG); adaptive sparse coding (SC) is then applied to capture high-level patterns. We add an effective on-line video segmentation step and, finally, recognize actions with linear SVMs. The main contribution of the paper is a real-time system for one-shot action modeling; moreover, we highlight the effectiveness of sparse coding techniques for representing 3D actions. We obtain very good results on the ChaLearn Gesture Dataset and with a Kinect sensor.
Research supported by the European FP7 ICT project No. 270490 (EFAA) and project No. 270273 (Xperience).
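To make the sparse coding step of the abstract more concrete, the following is a minimal sketch of how a descriptor could be encoded over an overcomplete dictionary with an L1 penalty. It uses plain ISTA (iterative soft thresholding), which is an assumption made for illustration and not necessarily the solver used in the paper. The function names, the toy dictionary, and the values of lam and n_iter are all hypothetical.

```python
import math


def soft_threshold(v, t):
    """Shrink v towards zero by t (proximal operator of the L1 norm)."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def sparse_code(x, atoms, lam=0.1, n_iter=500):
    """Approximately minimise 0.5*||x - sum_k a[k]*atoms[k]||^2 + lam*||a||_1.

    x     : descriptor, a list of d floats
    atoms : dictionary, a list of K atoms, each a list of d floats
            (K > d makes the dictionary overcomplete)
    """
    d, K = len(x), len(atoms)
    # The squared Frobenius norm of the dictionary upper-bounds the largest
    # eigenvalue of D^T D, so its inverse is a safe (conservative) step size.
    step = 1.0 / sum(v * v for atom in atoms for v in atom)
    a = [0.0] * K
    for _ in range(n_iter):
        # Residual of the current reconstruction: r = D a - x
        r = [sum(a[k] * atoms[k][i] for k in range(K)) - x[i] for i in range(d)]
        # Gradient step on the quadratic term, then shrinkage for the L1 term
        a = [
            soft_threshold(
                a[k] - step * sum(atoms[k][i] * r[i] for i in range(d)),
                step * lam,
            )
            for k in range(K)
        ]
    return a


# Toy overcomplete dictionary (hypothetical): 3 unit-norm atoms in 2 dimensions.
c = math.sqrt(0.5)
atoms = [[1.0, 0.0], [0.0, 1.0], [c, c]]
code = sparse_code([2.0, 0.0], atoms, lam=0.1, n_iter=2000)
```

In this toy case the descriptor is aligned with the first atom, so the code should concentrate its weight there and leave the other coefficients at zero. In a pipeline like the one described above, such codes would be computed from the 3DHOF and GHOG descriptors and passed to the linear SVMs; dictionary learning is not shown in this sketch.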
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Fanello, S.R., Gori, I., Metta, G., Odone, F. (2013). One-Shot Learning for Real-Time Action Recognition. In: Sanches, J.M., Micó, L., Cardoso, J.S. (eds) Pattern Recognition and Image Analysis. IbPRIA 2013. Lecture Notes in Computer Science, vol 7887. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-38628-2_4
DOI: https://doi.org/10.1007/978-3-642-38628-2_4
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-38627-5
Online ISBN: 978-3-642-38628-2
eBook Packages: Computer Science (R0)