Abstract
A novel algorithm is presented for the 3D reconstruction of human action in long (>30 second) monocular image sequences. A sequence is represented by a small set of automatically found representative keyframes. The skeletal joint positions are manually located in each keyframe and mapped to all other frames in the sequence. For each keyframe a 3D key pose is created, and interpolation between these 3D body poses, together with the incorporation of limb length and symmetry constraints, provides a smooth initial approximation of the 3D motion. This is then fitted to the image data to generate a realistic 3D reconstruction. The degree of manual input required is controlled by the diversity of the sequence's content. Sports footage is ideally suited to this approach as it frequently contains a limited number of repeated actions. Our method is demonstrated on a long (36 second) sequence of a woman playing tennis filmed with a non-stationary camera. This sequence required manual initialisation on fewer than 1.5% of the frames, and demonstrates that the system can deal with very rapid motion, severe self-occlusions, motion blur and clutter occurring over several consecutive frames. The monocular 3D reconstruction is verified by synthesising a view from the perspective of a 'ground truth' reference camera, and the result is seen to provide a qualitatively accurate 3D reconstruction of the motion.
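The interpolation step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the toy five-joint skeleton, the joint names, and the helper functions are all hypothetical, and plain linear interpolation of joint positions is used as a stand-in for whatever interpolation scheme the paper applies. The sketch blends two 3D key poses, then restores fixed, left/right-symmetric limb lengths by walking the skeleton from the root and rescaling each bone.

```python
import math

# Hypothetical toy skeleton (an assumption, not the paper's body model):
# each joint maps to its parent; the root has parent None.
PARENT = {
    "hip": None,
    "l_knee": "hip", "r_knee": "hip",
    "l_ankle": "l_knee", "r_ankle": "r_knee",
}
# Traversal order with every parent listed before its children.
ORDER = ["hip", "l_knee", "r_knee", "l_ankle", "r_ankle"]


def lerp_pose(pose_a, pose_b, t):
    """Linearly interpolate every joint position between two key poses."""
    return {
        j: tuple((1 - t) * a + t * b for a, b in zip(pose_a[j], pose_b[j]))
        for j in pose_a
    }


def bone_lengths(pose):
    """Length of the bone joining each non-root joint to its parent."""
    return {j: math.dist(pose[j], pose[PARENT[j]])
            for j in ORDER if PARENT[j] is not None}


def symmetric_lengths(lengths):
    """Average each left/right bone pair so both sides share one length."""
    out = dict(lengths)
    for j in lengths:
        if j.startswith("l_"):
            r = "r_" + j[2:]
            out[j] = out[r] = 0.5 * (lengths[j] + lengths[r])
    return out


def enforce_lengths(pose, lengths):
    """Keep each bone's direction but rescale it to its target length."""
    fixed = {}
    for j in ORDER:
        p = PARENT[j]
        if p is None:
            fixed[j] = tuple(pose[j])
            continue
        d = [c - pc for c, pc in zip(pose[j], pose[p])]
        norm = math.sqrt(sum(x * x for x in d))
        if norm == 0.0:
            raise ValueError(f"degenerate bone at joint {j!r}")
        scale = lengths[j] / norm
        fixed[j] = tuple(pc + scale * x for pc, x in zip(fixed[p], d))
    return fixed


# Two made-up key poses (coordinates in arbitrary units).
key_a = {"hip": (0.0, 1.0, 0.0),
         "l_knee": (-0.1, 0.5, 0.0), "r_knee": (0.1, 0.5, 0.0),
         "l_ankle": (-0.1, 0.0, 0.0), "r_ankle": (0.1, 0.0, 0.0)}
key_b = {"hip": (0.0, 1.0, 0.0),
         "l_knee": (-0.1, 0.6, 0.3), "r_knee": (0.1, 0.5, 0.0),
         "l_ankle": (-0.1, 0.2, 0.0), "r_ankle": (0.1, 0.0, 0.0)}

targets = symmetric_lengths(bone_lengths(key_a))
mid = enforce_lengths(lerp_pose(key_a, key_b, 0.5), targets)
```

Linear blending of positions generally shortens limbs between key poses, which is why the length correction is applied after interpolation. In the paper this constrained interpolation only provides an initial approximation, which is subsequently fitted to the image data; that fitting stage is not sketched here.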
© 2004 Springer-Verlag Berlin Heidelberg
Cite this paper
Loy, G., Eriksson, M., Sullivan, J., Carlsson, S. (2004). Monocular 3D Reconstruction of Human Motion in Long Action Sequences. In: Pajdla, T., Matas, J. (eds) Computer Vision - ECCV 2004. ECCV 2004. Lecture Notes in Computer Science, vol 3024. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-24673-2_36
DOI: https://doi.org/10.1007/978-3-540-24673-2_36
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-21981-1
Online ISBN: 978-3-540-24673-2