One-Shot Learning for Real-Time Action Recognition

Fanello, Sean Ryan; Gori, Ilaria; Metta, Giorgio; Odone, Francesca

doi:10.1007/978-3-642-38628-2_4

Sean Ryan Fanello¹⁹,²⁰, Ilaria Gori¹⁹, Giorgio Metta¹⁹ & Francesca Odone²⁰

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 7887)

Abstract

The goal of this paper is to develop a real-time, one-shot learning and recognition system for 3D actions. We use RGBD images, combine motion and appearance cues, and map them into a new overcomplete space. The proposed method relies on descriptors based on the 3D Histogram of Flow (3DHOF) and the Global Histogram of Oriented Gradient (GHOG); adaptive sparse coding (SC) is then applied to capture high-level patterns. We add effective on-line video segmentation and, finally, recognize actions with linear SVMs. The main contribution of the paper is a real-time system for one-shot action modeling; moreover, we highlight the effectiveness of sparse coding techniques for representing 3D actions. We obtain very good results on the ChaLearn Gesture Dataset and with a Kinect sensor.
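
To illustrate the sparse coding step mentioned in the abstract, the following is a minimal sketch and not the authors' implementation. It assumes a fixed random overcomplete dictionary and a plain ISTA solver, whereas the paper refers to adaptive sparse coding; the descriptor `x` is a random stand-in for a 3DHOF/GHOG histogram, and the names `sparse_code_ista`, `objective`, `lam`, and `n_iter` are hypothetical.

```python
import numpy as np

def soft_threshold(v, t):
    # Proximal operator of the L1 norm: shrink each entry toward zero by t.
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def objective(D, x, u, lam):
    # Lasso objective: reconstruction error plus L1 sparsity penalty.
    return 0.5 * np.sum((x - D @ u) ** 2) + lam * np.sum(np.abs(u))

def sparse_code_ista(D, x, lam=0.1, n_iter=200):
    # Minimize 0.5*||x - D u||^2 + lam*||u||_1 over u with ISTA.
    step_inv = np.linalg.norm(D, 2) ** 2  # Lipschitz constant of the gradient
    u = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ u - x)
        u = soft_threshold(u - grad / step_inv, lam / step_inv)
    return u

rng = np.random.default_rng(0)

# Assumption: a random dictionary with 128 unit-norm atoms for a 32-dim
# descriptor (overcomplete, since 128 > 32). A learned dictionary would
# replace this in practice.
D = rng.standard_normal((32, 128))
D /= np.linalg.norm(D, axis=0)

# Assumption: a random non-negative vector standing in for a frame descriptor.
x = rng.random(32)
x /= np.linalg.norm(x)

lam = 0.1
u = sparse_code_ista(D, x, lam=lam)
print("non-zero coefficients:", np.count_nonzero(u), "of", u.size)
```

The resulting code `u` lives in the higher-dimensional (overcomplete) space and would be the input to a linear classifier such as a linear SVM; that stage is omitted here.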

Research supported by the European FP7 ICT project No. 270490 (EFAA) and project No. 270273 (Xperience).

Author information

Authors and Affiliations

iCub Facility, Istituto Italiano di Tecnologia, Italy
Sean Ryan Fanello, Ilaria Gori & Giorgio Metta
Dipartimento di Informatica, Bioingegneria, Robotica e Ingegneria dei Sistemi, Università degli Studi di Genova, Italy
Sean Ryan Fanello & Francesca Odone

Editor information

Editors and Affiliations

Institute for Systems and Robotics, Instituto Superior Técnico, Portugal
João M. Sanches
University of Alicante, Spain
Luisa Micó
INESC and University of Porto, Porto, Portugal
Jaime S. Cardoso

Cite this paper

Fanello, S.R., Gori, I., Metta, G., Odone, F. (2013). One-Shot Learning for Real-Time Action Recognition. In: Sanches, J.M., Micó, L., Cardoso, J.S. (eds) Pattern Recognition and Image Analysis. IbPRIA 2013. Lecture Notes in Computer Science, vol 7887. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-38628-2_4

DOI: https://doi.org/10.1007/978-3-642-38628-2_4
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-38627-5
Online ISBN: 978-3-642-38628-2
eBook Packages: Computer Science, Computer Science (R0)