Abstract
We propose a fully automatic framework to detect and extract arbitrary human motion volumes from real-world videos collected from YouTube. Our system is composed of two stages. In the first stage, a person detector provides crude information about the possible locations of humans, and a constrained clustering algorithm then groups the detections and rejects false positives based on appearance similarity and spatio-temporal coherence. In the second stage, we apply a top-down pictorial structure model to complete the extraction of the humans in arbitrary motion. During this procedure, a density propagation technique based on a mixture of Gaussians is employed to propagate temporal information in a principled way. This method greatly reduces the search space for the measurement in the inference stage. We demonstrate the initial success of this framework both quantitatively and qualitatively on a number of YouTube videos.
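To illustrate how propagating a mixture of Gaussians over time can shrink the search space, the following is a minimal sketch of a prediction step only. It is not the authors' implementation: the abstract does not specify the motion model, so a linear model `A` with additive Gaussian noise `Q` is assumed here, and the function names `propagate_mixture` and `search_region`, the 2-D state, and all numeric values are hypothetical.

```python
import numpy as np

def propagate_mixture(weights, means, covs, A, Q):
    """Prediction step: push each Gaussian component through x' = A x + noise.

    Assumes a linear motion model A and noise covariance Q (illustrative only).
    weights: (K,), means: (K, d), covs: (K, d, d).
    """
    new_means = means @ A.T            # each mean becomes A @ mean
    new_covs = A @ covs @ A.T + Q      # each covariance becomes A C A^T + Q
    return weights.copy(), new_means, new_covs

def search_region(means, covs, n_sigma=2.0):
    """Axis-aligned box covering n_sigma standard deviations of every component."""
    std = np.sqrt(np.einsum("kii->ki", covs))   # per-component diagonal std
    lo = (means - n_sigma * std).min(axis=0)
    hi = (means + n_sigma * std).max(axis=0)
    return lo, hi

# Hypothetical 2-D state (x, y image position) with two hypotheses.
weights = np.array([0.5, 0.5])
means = np.array([[10.0, 20.0], [14.0, 22.0]])
covs = np.stack([np.eye(2), np.eye(2)])
A = np.eye(2)          # assumed: position roughly constant between frames
Q = 4.0 * np.eye(2)    # assumed: process noise

w, m, c = propagate_mixture(weights, means, covs, A, Q)
lo, hi = search_region(m, c)
print("search box:", lo, hi)
```

Under these assumptions, measurements in the next frame would only need to be evaluated inside the returned box rather than over the whole image, which is one simple way a propagated density can restrict the search.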
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Niebles, J.C., Han, B., Ferencz, A., Fei-Fei, L. (2008). Extracting Moving People from Internet Videos. In: Forsyth, D., Torr, P., Zisserman, A. (eds) Computer Vision – ECCV 2008. ECCV 2008. Lecture Notes in Computer Science, vol 5305. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-88693-8_39
DOI: https://doi.org/10.1007/978-3-540-88693-8_39
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-88692-1
Online ISBN: 978-3-540-88693-8
eBook Packages: Computer Science, Computer Science (R0)