Estimating Human Body Configurations Using Shape Context Matching

  • Greg Mori
  • Jitendra Malik
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2352)


The problem we consider in this paper is to take a single two-dimensional image containing a human body, locate the joint positions, and use these to estimate the body configuration and pose in three-dimensional space. The basic approach is to store a number of exemplar 2D views of the human body in a variety of different configurations and viewpoints with respect to the camera. On each of these stored views, the locations of the body joints (left elbow, right knee, etc.) are manually marked and labelled for future use. The test shape is then matched to each stored view, using the technique of shape context matching in conjunction with a kinematic chain-based deformation model. Assuming that there is a stored view sufficiently similar in configuration and pose, the correspondence process will succeed. The locations of the body joints are then transferred from the exemplar view to the test shape. Given the joint locations, the 3D body configuration and pose are then estimated. We can apply this technique to video by treating each frame independently: tracking just becomes repeated recognition! We present results on a variety of datasets.
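The exemplar pipeline described above (describe each sampled point by a shape context, match the test shape to every stored view, transfer the hand-marked joints from the best view) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration rather than the authors' implementation: the function names, the toy point sets, the bin settings (5 radial by 12 angular bins, radii from 1/8 to 2 mean distances), the chi-square cost, and the translation-only joint transfer. A nearest-descriptor lookup stands in for a proper one-to-one assignment, and the kinematic chain-based deformation model and the 3D estimation step are omitted.

```python
import math

def shape_context(points, n_r=5, n_theta=12):
    """Return one normalised log-polar histogram per point (assumed bin settings)."""
    n = len(points)
    pair_d = [math.hypot(points[i][0] - points[j][0], points[i][1] - points[j][1])
              for i in range(n) for j in range(i + 1, n)]
    mean_d = sum(pair_d) / len(pair_d)  # mean pairwise distance normalises scale
    # Log-spaced outer radii of the radial bins, from 1/8 to 2 mean distances.
    edges = [0.125 * 16.0 ** (k / (n_r - 1)) for k in range(n_r)]
    hists = []
    for i, (px, py) in enumerate(points):
        h = [0.0] * (n_r * n_theta)
        for j, (qx, qy) in enumerate(points):
            if i == j:
                continue
            r = math.hypot(qx - px, qy - py) / mean_d
            r_bin = next((k for k, e in enumerate(edges) if r <= e), None)
            if r_bin is None:
                continue  # farther away than the outermost ring
            theta = math.atan2(qy - py, qx - px) % (2 * math.pi)
            t_bin = min(int(theta / (2 * math.pi) * n_theta), n_theta - 1)
            h[r_bin * n_theta + t_bin] += 1.0
        total = sum(h) or 1.0
        hists.append([v / total for v in h])
    return hists

def chi2(g, h):
    """Chi-square distance between two normalised histograms."""
    return 0.5 * sum((a - b) ** 2 / (a + b) for a, b in zip(g, h) if a + b > 0)

def match_exemplar(ex_pts, ex_joints, test_pts):
    """Match one stored view to the test shape; return (mean cost, transferred joints)."""
    ex_sc, test_sc = shape_context(ex_pts), shape_context(test_pts)
    match, total = [], 0.0
    for g in ex_sc:
        # Stand-in for one-to-one assignment: take the closest test descriptor.
        costs = [chi2(g, h) for h in test_sc]
        j = min(range(len(costs)), key=costs.__getitem__)
        match.append(j)
        total += costs[j]
    joints = {}
    for name, (jx, jy) in ex_joints.items():
        # Anchor each marked joint to its nearest exemplar sample point, then
        # carry the same offset over to that point's match (translation only).
        k = min(range(len(ex_pts)),
                key=lambda i: math.hypot(ex_pts[i][0] - jx, ex_pts[i][1] - jy))
        (ex, ey), (tx, ty) = ex_pts[k], test_pts[match[k]]
        joints[name] = (tx + jx - ex, ty + jy - ey)
    return total / len(ex_sc), joints

def estimate_joints(exemplars, test_pts):
    """Try every stored view and keep the one with the lowest matching cost."""
    best = None
    for name, (pts, joints) in exemplars.items():
        cost, located = match_exemplar(pts, joints, test_pts)
        if best is None or cost < best[1]:
            best = (name, cost, located)
    return best

# Hypothetical toy data: two stored views with hand-marked joints, and a test
# shape that is the "standing" view shifted by (10, 5).
STANDING = [(2, 8), (2, 6), (0, 5), (5, 6), (2, 3), (1, 0), (4, 1)]
EXEMPLARS = {
    "standing": (STANDING, {"head": (2.0, 7.5), "left_ankle": (1.0, 0.5)}),
    "lying": ([(0, 0), (2, 0), (3, 0), (5, 0), (6, 0), (8, 0), (9, 0)],
              {"head": (0.5, 0.0), "left_ankle": (8.5, 0.0)}),
}
TEST_SHAPE = [(x + 10, y + 5) for x, y in STANDING]
name, cost, joints = estimate_joints(EXEMPLARS, TEST_SHAPE)
print(name, cost, joints)
```

Because the descriptors are built from relative positions and normalised by the mean pairwise distance, the shifted test shape matches its source view at zero cost in this toy case. Applying the same function to each video frame in turn would correspond to the "repeated recognition" view of tracking.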



Copyright information

© Springer-Verlag Berlin Heidelberg 2002

Authors and Affiliations

  • Greg Mori (1)
  • Jitendra Malik (1)
  1. Computer Science Division, University of California at Berkeley, Berkeley
