Exploiting Spatio-temporal Constraints for Robust 2D Pose Tracking

  • Grégory Rogez
  • Ignasi Rius
  • Jesús Martínez-del-Rincón
  • Carlos Orrite
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4814)


We present a Spatio-temporal 2D Models Framework (STMF) for 2D pose tracking. Space and time are discretized, and a mixture of probabilistic “local models” is learnt that associates 2D shapes with 2D stick figures. These spatio-temporal models generalize well for a particular viewpoint and state of the tracked action, but spatio-temporal discontinuities can appear along a sequence as a direct consequence of the discretization. To overcome this problem, we propose applying a Rao-Blackwellized Particle Filter (RBPF) in the 2D pose eigenspace, thus interpolating unseen data between view-based clusters. The fitness of the predicted 2D poses to the images is evaluated by combining our STMF with spatio-temporal constraints. A robust, fast and smooth human motion tracker is obtained by tracking only the few most important dimensions of the state space and refining deterministically with our STMF.
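
As an illustration of tracking in a low-dimensional pose eigenspace, the following is a minimal sketch and not the authors' implementation. It swaps in a plain bootstrap particle filter (not a Rao-Blackwellized one), assumes random-walk dynamics, and takes a caller-supplied `log_likelihood` function as a stand-in for the STMF-based image fitness. All function and parameter names are hypothetical.

```python
import numpy as np


def fit_eigenspace(poses, n_dims):
    """PCA on training poses; each row is a flattened 2D stick figure."""
    mean = poses.mean(axis=0)
    # SVD of the centred data: rows of vt are orthonormal principal directions.
    _, _, vt = np.linalg.svd(poses - mean, full_matrices=False)
    return mean, vt[:n_dims]


def project(pose, mean, basis):
    """Map a pose to its eigenspace coefficients."""
    return (pose - mean) @ basis.T


def reconstruct(coeffs, mean, basis):
    """Map eigenspace coefficients back to a full pose vector."""
    return mean + coeffs @ basis


def particle_filter_step(particles, weights, log_likelihood, noise_std, rng):
    """One resample / predict / weight cycle of a bootstrap particle filter.

    particles: (n, d) array of eigenspace coefficients.
    weights: (n,) array of normalised weights from the previous step.
    log_likelihood: hypothetical callable scoring one particle against the image.
    """
    n = len(particles)
    # Resample particle indices in proportion to the previous weights.
    idx = rng.choice(n, size=n, p=weights)
    # Predict with random-walk dynamics (an assumption made for this sketch).
    predicted = particles[idx] + rng.normal(0.0, noise_std, particles.shape)
    # Weight each prediction; subtract the max log value for numerical stability.
    log_w = np.array([log_likelihood(p) for p in predicted])
    w = np.exp(log_w - log_w.max())
    return predicted, w / w.sum()
```

Only the first few eigenspace coefficients are sampled here, which mirrors the idea of tracking a few important dimensions; a weighted mean of the particles passed through `reconstruct` would give a pose estimate that a deterministic refinement step could then adjust.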


Particle Filter, Tracking Framework, Training View, Silhouette Extraction, Monocular Image Sequence
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2007

Authors and Affiliations

  • Grégory Rogez (1)
  • Ignasi Rius (2)
  • Jesús Martínez-del-Rincón (1)
  • Carlos Orrite (1)
  1. Computer Vision Lab, I3A, University of Zaragoza, Spain
  2. Computer Vision Center, UAB, Bellaterra, Spain
