Video-Based Human Motion Estimation by Part-Whole Gait Manifold Learning

  • Guoliang Fan
  • Xin Zhang
Part of the Advances in Pattern Recognition book series (ACVPR)

Abstract

This chapter presents a general gait representation framework for video-based human motion estimation that models gait at both the whole-body and part levels. Our goal is to estimate the kinematics of an unknown gait from image sequences taken by a single camera. The approach involves two generative models, the kinematic gait generative model (KGGM) and the visual gait generative model (VGGM), which represent the kinematics and the appearance of a gait, respectively, by a few latent variables. In particular, we propose the concept of a gait manifold to capture gait variability among different individuals. The gait manifold integrates KGGM and VGGM, so that a new gait with unknown kinematics can be inferred from its appearance. A key issue in generating a gait manifold is the definition of the distance function that reflects the dissimilarity between two individual gaits. We investigate and compare three distance functions, each of which leads to a specific gait manifold. Moreover, we extend the gait modeling framework from the whole level to the part level by decomposing a gait into two parts, an upper-body gait and a lower-body gait, each of which is associated with its own gait manifold for part-level gait modeling. A two-stage inference algorithm is employed for whole-part gait estimation. The proposed algorithms were trained on the CMU Mocap data and tested on the HumanEva data, and the experimental results are promising compared with state-of-the-art algorithms under similar experimental settings.
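
As a rough illustration of how a distance function between individual gaits can induce an ordering of those gaits along a low-dimensional manifold, the sketch below builds a pairwise distance matrix and embeds it in one dimension with classical multidimensional scaling. Everything in it is an assumption made for illustration: the synthetic two-angle "gaits", the mean pose distance in `gait_distance`, and the use of classical MDS are not the distance functions or the manifold construction used in the chapter.

```python
import numpy as np


def gait_distance(a, b):
    """Mean Euclidean distance between phase-aligned poses.

    Hypothetical choice for illustration; a and b have shape (T, D).
    """
    return float(np.mean(np.linalg.norm(a - b, axis=1)))


def pairwise_distances(gaits):
    """Symmetric matrix of distances between every pair of gaits."""
    n = len(gaits)
    d = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            d[i, j] = d[j, i] = gait_distance(gaits[i], gaits[j])
    return d


def classical_mds(d, dim=1):
    """Embed points in `dim` dimensions from a distance matrix."""
    n = d.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (d ** 2) @ centering
    vals, vecs = np.linalg.eigh(gram)
    top = np.argsort(vals)[::-1][:dim]
    return vecs[:, top] * np.sqrt(np.maximum(vals[top], 0.0))


# Synthetic gaits: one cycle of two joint angles; only the amplitude
# differs between "individuals" (an assumption, not real mocap data).
phase = np.linspace(0.0, 2.0 * np.pi, 30, endpoint=False)
amplitudes = [0.5, 1.0, 1.5, 2.0]
gaits = [np.stack([a * np.sin(phase), a * np.cos(phase)], axis=1)
         for a in amplitudes]

D = pairwise_distances(gaits)
coords = classical_mds(D, dim=1)
order = np.argsort(coords[:, 0])  # ordering of individuals along the 1D embedding
```

In this toy setting the distance between two gaits reduces to the difference of their amplitudes, so the embedding orders the individuals by amplitude (up to reversal). A different distance function would in general produce a different ordering, which is the sensitivity the abstract refers to.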

Keywords

Motion Estimation; Scale Invariant Feature Transform; Inference Algorithm; Training Gait; Torus Manifold
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Notes

Acknowledgements

This work is supported by the National Science Foundation (NSF) under Grant IIS-0347613 and an OHRS award (HR09-030) from the Oklahoma Center for the Advancement of Science and Technology (OCAST).

Copyright information

© Springer-Verlag London Limited 2011
