Extracting meaningful 3D human motion information from video sequences is of interest for applications such as intelligent human–computer interfaces, biometrics, video browsing and indexing, virtual reality, and video surveillance. Analyzing videos of humans in unconstrained environments is an open and active research problem that poses substantial scientific and computational challenges. The proportions of the human body vary widely across individuals with gender, age, weight, or race. Beyond this variability, any single human body has many degrees of freedom due to articulation, and the individual limbs are deformable because of moving muscle and clothing. Finally, real-world events involve multiple interacting humans who are occluded by each other or by other objects, and scene conditions may also vary with camera motion or lighting changes. All these factors make appropriate models of human structure, motion, and action difficult to construct and difficult to estimate from images. In this chapter we give an overview of the problem of reconstructing 3D human motion from image sequences acquired with a single video camera. We explain the difficulties involved, discuss ways to address them using generative and discriminative models, and speculate on open problems and future research directions.
© 2008 Springer
About this chapter
Cite this chapter
Sminchisescu, C. (2008). 3D Human Motion Analysis in Monocular Video: Techniques and Challenges. In: Rosenhahn, B., Klette, R., Metaxas, D. (eds) Human Motion. Computational Imaging and Vision, vol 36. Springer, Dordrecht. https://doi.org/10.1007/978-1-4020-6693-1_8
DOI: https://doi.org/10.1007/978-1-4020-6693-1_8
Publisher Name: Springer, Dordrecht
Print ISBN: 978-1-4020-6692-4
Online ISBN: 978-1-4020-6693-1
eBook Packages: Computer Science (R0)