Bottom-Up Recognition and Parsing of the Human Body

Srinivasan, Praveen; Shi, Jianbo

doi:10.1007/978-3-540-74198-5_13

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 4679)

Included in the following conference series:

International Workshop on Energy Minimization Methods in Computer Vision and Pattern Recognition

Abstract

Recognizing humans, estimating their pose and segmenting their body parts are key to high-level image understanding. Because humans are highly articulated, the range of deformations they undergo makes this task extremely challenging. Previous methods have focused largely on heuristics or pairwise part models in approaching this problem. We propose a bottom-up growing process, similar to parsing, that builds increasingly complete partial body masks guided by a set of parse rules. At each level of the growing process, we evaluate the partial body masks directly via shape matching with exemplars (and also image features), without regard to how the hypotheses are formed. Unlike previous approaches, the body is evaluated as a whole rather than as the sum of its parts. Multiple image segmentations are included at each level of the growing/parsing, to augment existing hypotheses or to introduce new ones. Our method yields both a pose estimate and a segmentation of the human. We demonstrate competitive results on this challenging task with relatively few training examples on a dataset of baseball players with wide pose variation. Our method is comparatively simple and could be easily extended to other objects. We also give a learning framework for parse ranking that allows us to keep fewer parses for similar performance.
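
The growing process summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The parse rules, the part names, the `grow` and `score` helpers and the `keep` parameter are all hypothetical, and a simple intersection-over-union overlap with exemplar masks stands in for the shape matching used in the paper. Masks are represented as frozensets of (row, column) pixels.

```python
from itertools import combinations

# Hypothetical parse rules, ordered bottom-up: parent part -> two child parts.
PARSE_RULES = {
    "lower_body": ("leg", "leg"),
    "whole_body": ("lower_body", "torso_head"),
}

def iou(mask_a, mask_b):
    """Intersection over union of two pixel sets."""
    union = mask_a | mask_b
    return len(mask_a & mask_b) / len(union) if union else 0.0

def score(mask, exemplars):
    """Best overlap with any exemplar mask (a stand-in for shape matching).

    The score depends only on the mask itself, not on how it was composed.
    """
    return max((iou(mask, ex) for ex in exemplars), default=0.0)

def grow(hypotheses, exemplars, rules=PARSE_RULES, keep=3):
    """Grow partial body masks level by level and keep the best few per part.

    hypotheses: dict mapping part name -> list of candidate masks; entries
                for composite parts play the role of masks introduced
                directly from image segmentations.
    exemplars:  dict mapping part name -> list of exemplar masks.
    """
    hyps = {name: list(masks) for name, masks in hypotheses.items()}
    for parent, (left, right) in rules.items():
        candidates = list(hyps.get(parent, []))
        if left == right:
            pairs = combinations(hyps.get(left, []), 2)
        else:
            pairs = ((a, b) for a in hyps.get(left, [])
                     for b in hyps.get(right, []))
        for a, b in pairs:
            merged = a | b
            # Children must not overlap; skip duplicate masks.
            if not (a & b) and merged not in candidates:
                candidates.append(merged)
        candidates.sort(key=lambda m: score(m, exemplars.get(parent, [])),
                        reverse=True)
        hyps[parent] = candidates[:keep]
    return hyps
```

Calling `grow` with leg and torso-head segments returns, for each composite part, the highest-scoring masks first, so the top entry under `"whole_body"` serves as the segmentation and the chain of merged parts as a coarse pose estimate. Lowering `keep` prunes more aggressively, which is the trade-off the parse-ranking framework in the abstract is meant to improve.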

Author information

Authors and Affiliations

GRASP Lab, University of Pennsylvania, 3330 Walnut Street, Philadelphia, PA 19104
Praveen Srinivasan & Jianbo Shi

Editor information

Alan L. Yuille, Song-Chun Zhu, Daniel Cremers, Yongtian Wang

Cite this paper

Srinivasan, P., Shi, J. (2007). Bottom-Up Recognition and Parsing of the Human Body. In: Yuille, A.L., Zhu, SC., Cremers, D., Wang, Y. (eds) Energy Minimization Methods in Computer Vision and Pattern Recognition. EMMCVPR 2007. Lecture Notes in Computer Science, vol 4679. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-74198-5_13

DOI: https://doi.org/10.1007/978-3-540-74198-5_13
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-74195-4
Online ISBN: 978-3-540-74198-5
eBook Packages: Computer Science, Computer Science (R0)
