Abstract
We examine the problem of classifying action sequences given a small set of examples for each type of action. Based on the presumption that human motion resides in a low-dimensional space, we introduce a probabilistic dimensionality reduction model able to recover the structure of a low-dimensional manifold on which all the involved actions reside. By requiring that sequences of the same action be kept apart from sequences of other actions, and by performing the classification on this manifold, we achieve higher classification rates than other commonly used techniques. The main contribution is a new model, based on the back-constrained GP-LVM, that can be used for the efficient classification of sequences. We compare our method with classification based on the Dynamic Time Warping distance and with the V-GPDS model, adapted for classification. Results are provided for sequences taken from two publicly available datasets, which highlight different aspects of the method.
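As an illustration of the Dynamic Time Warping baseline mentioned above, the following is a minimal sketch of nearest-neighbour sequence classification under the DTW distance. It is not the proposed model, and the details are assumptions rather than the authors' setup: the Euclidean distance between frames, the 1-nearest-neighbour decision rule, and the names `dtw_distance` and `classify_1nn` are all hypothetical choices made for this example.

```python
import math


def dtw_distance(a, b):
    """DTW distance between two sequences of pose vectors.

    Each sequence is a list of frames; each frame is a list of floats.
    The Euclidean frame distance is an assumption made for this sketch.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] holds the minimal accumulated cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])
            # Extend the cheapest of the three admissible predecessor alignments.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def classify_1nn(query, examples):
    """Return the label of the example closest to `query` under DTW.

    `examples` is a list of (label, sequence) pairs, e.g. a few per action.
    """
    return min(examples, key=lambda ex: dtw_distance(query, ex[1]))[0]


if __name__ == "__main__":
    # Toy one-dimensional "poses"; real data would use joint positions or angles.
    examples = [
        ("walk", [[0.0], [1.0], [2.0]]),
        ("jump", [[0.0], [5.0], [0.0]]),
    ]
    query = [[0.0], [0.0], [1.0], [2.0], [2.0]]  # a time-stretched "walk"
    print(classify_1nn(query, examples))
```

Because DTW allows frames to be repeated along the alignment, the time-stretched query matches the "walk" example at zero cost even though the two sequences differ in length.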
Acknowledgments
This paper describes research done under the EU-FP7 ICT 247870 NIFTi project.
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Ntouskos, V., Papadakis, P., Pirri, F. (2015). Probabilistic Discriminative Dimensionality Reduction for Pose-Based Action Recognition. In: Fred, A., De Marsico, M. (eds) Pattern Recognition Applications and Methods. Advances in Intelligent Systems and Computing, vol 318. Springer, Cham. https://doi.org/10.1007/978-3-319-12610-4_9
DOI: https://doi.org/10.1007/978-3-319-12610-4_9
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-12609-8
Online ISBN: 978-3-319-12610-4
eBook Packages: Engineering, Engineering (R0)