Efficient Upper Body Pose Estimation from a Single Image or a Sequence

  • Matheen Siddiqui
  • Gérard Medioni
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4814)


We propose a method to find candidate 2D articulated model configurations by searching for locally optimal configurations under a weak but computationally manageable fitness function. This is accomplished by first parameterizing a tree structure by its joints. Candidate configurations can then be assembled efficiently and exhaustively in a bottom-up manner. Working from the leaves of the tree to its root, we maintain a list of locally optimal yet sufficiently distinct candidate configurations for the body pose.
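The bottom-up assembly described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function `assemble_candidates`, the shared grid of candidate `locations`, the `limb_score` fitness function, and the `max_keep` and `min_dist` parameters are all assumptions made for illustration. As a simplification, the sketch keeps only the single best sub-configuration per parent location and applies the distinctness test once, at the root.

```python
from math import hypot


def assemble_candidates(children, root, locations, limb_score,
                        max_keep=3, min_dist=2.0):
    """Return up to max_keep high-scoring, mutually distinct configurations.

    children:   dict mapping a joint to its child joints (a tree)
    locations:  candidate (x, y) positions, shared by all joints (assumption)
    limb_score: hypothetical fitness f(parent, p_xy, child, c_xy) -> float
    """
    def best_subtrees(joint):
        # For each location of `joint`: best (score, config) of its subtree.
        table = {xy: (0.0, {joint: xy}) for xy in locations}
        for child in children.get(joint, []):
            child_table = best_subtrees(child)  # leaves are solved first
            for xy in locations:
                score, config = table[xy]
                # Exhaustively try every child location and keep the best.
                best = max(
                    ((limb_score(joint, xy, child, cxy) + s, c)
                     for cxy, (s, c) in child_table.items()),
                    key=lambda t: t[0],
                )
                table[xy] = (score + best[0], {**config, **best[1]})
        return table

    ranked = sorted(best_subtrees(root).values(), key=lambda t: -t[0])
    kept = []
    for score, config in ranked:
        # Distinct means at least one joint differs by min_dist or more.
        def far(other):
            return max(hypot(config[j][0] - other[j][0],
                             config[j][1] - other[j][1])
                       for j in config) >= min_dist
        if all(far(k) for _, k in kept):
            kept.append((score, config))
        if len(kept) == max_keep:
            break
    return kept
```

For example, with `children = {"shoulder": ["elbow"], "elbow": ["hand"]}` and a fitness that rewards limbs of a preferred length, the call returns a short list of `(score, {joint: (x, y)})` pairs ordered by score. Because every child location is tried for every parent location, the cost grows with the square of the number of locations per limb, which is why a cheap fitness function matters here.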

We then adapt this algorithm for use on a sequence of images by considering only configurations that are either near their position in the previous frame or overlap areas of interest in subsequent frames. In this way, the number of partial configurations generated and evaluated is significantly reduced, while both smooth and abrupt motions can still be accommodated. This approach is validated on test and standard datasets.
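The pruning idea for sequences can be sketched as a filter on the candidate locations of a joint. This is only an illustration under stated assumptions: `prune_locations`, the `radius` threshold, and the boolean grid `interest_mask` (indexed as `interest_mask[y][x]`) are hypothetical names, and the abstract does not say how areas of interest are computed.

```python
from math import hypot


def prune_locations(locations, previous_xy, interest_mask, radius=2.0):
    """Keep locations near the previous position or inside an area of interest.

    locations:     candidate (x, y) integer positions for one joint
    previous_xy:   the joint's position in the previous frame
    interest_mask: hypothetical 2D boolean grid, True inside areas of interest
    radius:        assumed threshold for "near"
    """
    return [
        (x, y) for (x, y) in locations
        # Smooth motion: the joint stays close to where it was.
        if hypot(x - previous_xy[0], y - previous_xy[1]) <= radius
        # Abrupt motion: the joint may jump into an area of interest.
        or interest_mask[y][x]
    ]
```

The two conditions correspond to the two cases in the text: the distance test covers smooth motion, and the mask test lets a joint reappear far from its previous position. Only the surviving locations would then be passed to the assembly step, which reduces the number of partial configurations to evaluate.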


Single Image, Gesture Recognition, Previous Frame, Subsequent Frame, Joint Location
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2007

Authors and Affiliations

  • Matheen Siddiqui (1)
  • Gérard Medioni (1)
  1. Univ. of Southern Calif, 1010 Watt Way, Los Angeles, CA
