Abstract
This paper presents a novel tool for detecting human actions in videos from stationary surveillance cameras. The proposed method does not need to detect and track the human body, nor to detect spatial or spatio-temporal interest points of the events. Instead, it computes single-scale spatio-temporal descriptors to characterize the action patterns. Two descriptors are evaluated: histograms of optical flow directions and histograms of frame difference gradients. The integral video method is also presented to make the extraction of these features more efficient. We evaluated our methods on two datasets: a public dataset containing actions of persons drinking, and a new dataset containing stand-up events. According to our experiments, both detectors are suitable for indoor applications and remain robust under practical problems such as moving backgrounds and partial occlusion.
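The abstract names the integral video method but this preview does not show the authors' formulation. As a rough illustration only, the sketch below uses the standard three-dimensional summed-area table, which extends the integral image to the time axis: after one pass of cumulative sums, the sum of any spatio-temporal box can be read with eight lookups. The function names `integral_video` and `box_sum`, and the `(t, y, x)` axis order, are assumptions made for this example and are not taken from the paper.

```python
import numpy as np

def integral_video(volume):
    """Build a 3D summed-area table for a (t, y, x) feature volume.

    The result is zero-padded at the front of each axis, so
    iv[t, y, x] equals volume[:t, :y, :x].sum().
    """
    iv = volume.astype(np.float64).cumsum(axis=0).cumsum(axis=1).cumsum(axis=2)
    return np.pad(iv, ((1, 0), (1, 0), (1, 0)))

def box_sum(iv, t0, t1, y0, y1, x0, x1):
    """Sum of volume[t0:t1, y0:y1, x0:x1] via inclusion-exclusion (8 lookups)."""
    return (iv[t1, y1, x1]
            - iv[t0, y1, x1] - iv[t1, y0, x1] - iv[t1, y1, x0]
            + iv[t0, y0, x1] + iv[t0, y1, x0] + iv[t1, y0, x0]
            - iv[t0, y0, x0])

if __name__ == "__main__":
    # Hypothetical feature channel, e.g. per-pixel votes for one histogram bin.
    rng = np.random.default_rng(0)
    channel = rng.random((6, 8, 10))
    iv = integral_video(channel)
    fast = box_sum(iv, 1, 4, 2, 6, 0, 5)
    slow = channel[1:4, 2:6, 0:5].sum()
    print(np.isclose(fast, slow))
```

One plausible use, again an assumption rather than the paper's stated design, is to keep one such table per histogram bin, so that the histogram of any spatio-temporal cell in a sliding detection window costs a constant number of lookups per bin regardless of cell size.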
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Utasi, Á., Kovács, A. (2010). Recognizing Human Actions by Using Spatio-temporal Motion Descriptors. In: Blanc-Talon, J., Bone, D., Philips, W., Popescu, D., Scheunders, P. (eds) Advanced Concepts for Intelligent Vision Systems. ACIVS 2010. Lecture Notes in Computer Science, vol 6475. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-17691-3_34
DOI: https://doi.org/10.1007/978-3-642-17691-3_34
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-17690-6
Online ISBN: 978-3-642-17691-3
eBook Packages: Computer Science, Computer Science (R0)