Abstract
In this work, we introduce a model-based approach to extracting the silhouettes of people in motion from stereo video sequences. To this end, we extend a purely stereo-based approach to tracking people proposed in earlier work. This approach is based on an implicit surface model of the body. The model lets us accurately predict the silhouettes’ locations and, therefore, detect them more robustly. In turn, these silhouettes allow us to fit the model more precisely. This allows effective motion recovery, even when people are filmed against a cluttered, unknown background. This is in contrast to many recent approaches that require silhouette contours to be readily obtainable using relatively simple methods, such as background subtraction, that typically require either engineering the scene or making strong assumptions.
We demonstrate our approach’s effectiveness using complex and fully three-dimensional motion sequences where the ability to combine stereo and silhouette information is key to obtaining good results.
This work was supported in part by the Swiss Federal Office for Education and Science.
Copyright information
© 2002 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Plaenkers, R., Fua, P. (2002). Model-Based Silhouette Extraction for Accurate People Tracking. In: Heyden, A., Sparr, G., Nielsen, M., Johansen, P. (eds) Computer Vision — ECCV 2002. ECCV 2002. Lecture Notes in Computer Science, vol 2351. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-47967-8_22
DOI: https://doi.org/10.1007/3-540-47967-8_22
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-43744-4
Online ISBN: 978-3-540-47967-3
eBook Packages: Springer Book Archive