3D Human Motion Analysis in Monocular Video: Techniques and Challenges

Sminchisescu, Cristian

doi:10.1007/978-1-4020-6693-1_8

Part of the book series: Computational Imaging and Vision (CIVI, volume 36)

Extracting meaningful 3D human motion information from video sequences is of interest for applications such as intelligent human–computer interfaces, biometrics, video browsing and indexing, virtual reality, and video surveillance. Analyzing videos of humans in unconstrained environments is an open and active research problem that poses substantial scientific and computational challenges. The proportions of the human body vary widely across individuals with gender, age, weight or race. Beyond this variability, any single human body has many degrees of freedom due to articulation, and the individual limbs deform as muscles and clothing move. Finally, real-world events involve multiple interacting humans occluded by each other or by other objects, and scene conditions may also vary with camera motion or lighting changes. All these factors make appropriate models of human structure, motion and action difficult to construct and difficult to estimate from images. In this chapter we give an overview of the problem of reconstructing 3D human motion from sequences of images acquired with a single video camera. We explain the difficulties involved, discuss ways to address them using generative and discriminative models, and speculate on open problems and future research directions.

Author information

Authors and Affiliations

TTI-C, University of Chicago Press, 1427 East 60th Street, 60637, Chicago, IL, USA
Cristian Sminchisescu

Editor information

Editors and Affiliations

Max-Planck Institute for Computer Science, Stuhlsatzhausenweg 85, D-66123, Saarbrücken, Germany
Bodo Rosenhahn
The University of Auckland, New Zealand
Reinhard Klette
Rutgers University, Piscataway, NJ, USA
Dimitris Metaxas

Cite this chapter

Sminchisescu, C. (2008). 3D Human Motion Analysis in Monocular Video: Techniques and Challenges. In: Rosenhahn, B., Klette, R., Metaxas, D. (eds) Human Motion. Computational Imaging and Vision, vol 36. Springer, Dordrecht. https://doi.org/10.1007/978-1-4020-6693-1_8

DOI: https://doi.org/10.1007/978-1-4020-6693-1_8
Publisher Name: Springer, Dordrecht
Print ISBN: 978-1-4020-6692-4
Online ISBN: 978-1-4020-6693-1
eBook Packages: Computer Science, Computer Science (R0)
