Extracting Moving People from Internet Videos

Niebles, Juan Carlos; Han, Bohyung; Ferencz, Andras; Fei-Fei, Li

doi:10.1007/978-3-540-88693-8_39

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 5305)

Included in the following conference series: European Conference on Computer Vision

Abstract

We propose a fully automatic framework to detect and extract arbitrary human motion volumes from real-world videos collected from YouTube. Our system is composed of two stages. In the first stage, a person detector is applied to provide crude information about the possible locations of humans. A constrained clustering algorithm then groups the detections and rejects false positives based on appearance similarity and spatio-temporal coherence. In the second stage, we apply a top-down pictorial structure model to complete the extraction of the humans in arbitrary motion. During this procedure, a density propagation technique based on a mixture of Gaussians is employed to propagate temporal information in a principled way. This greatly reduces the search space for measurements in the inference stage. We demonstrate the initial success of this framework both quantitatively and qualitatively on a number of YouTube videos.
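
To make the idea of density propagation with a mixture of Gaussians more concrete, the following is a minimal one-dimensional sketch in Python. It is not the authors' implementation. It assumes a simple linear motion model with Gaussian noise and a Gaussian measurement likelihood (a Gaussian sum filter), and every name in it (Component, predict, update, search_interval, drift, process_var, meas_var) is hypothetical. It illustrates how a predicted mixture could bound the region in which measurements are taken.

```python
import math
from dataclasses import dataclass


@dataclass
class Component:
    """One Gaussian in the mixture (1-D state, e.g. a part's x position)."""
    weight: float
    mean: float
    var: float


def gaussian_pdf(x, mean, var):
    """Density of N(mean, var) evaluated at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def predict(mixture, drift, process_var):
    """Propagate each component through x_t = x_{t-1} + drift + noise.

    Assumed motion model: constant drift with Gaussian noise of
    variance process_var. Weights are unchanged by prediction.
    """
    return [Component(c.weight, c.mean + drift, c.var + process_var)
            for c in mixture]


def update(mixture, z, meas_var):
    """Combine the predicted mixture with a measurement z ~ N(x, meas_var).

    Each component gets a Kalman-style update, and its weight is scaled
    by how well it explained the measurement before renormalizing.
    """
    updated = []
    for c in mixture:
        gain = c.var / (c.var + meas_var)
        mean = c.mean + gain * (z - c.mean)
        var = (1.0 - gain) * c.var
        evidence = gaussian_pdf(z, c.mean, c.var + meas_var)
        updated.append(Component(c.weight * evidence, mean, var))
    total = sum(c.weight for c in updated)
    return [Component(c.weight / total, c.mean, c.var) for c in updated]


def search_interval(mixture, n_sigma=2.0):
    """Interval covering every component within n_sigma standard deviations.

    Restricting measurements to this interval is one (assumed) way a
    propagated density can shrink the search space.
    """
    lo = min(c.mean - n_sigma * math.sqrt(c.var) for c in mixture)
    hi = max(c.mean + n_sigma * math.sqrt(c.var) for c in mixture)
    return lo, hi


# Two equally likely hypotheses about the position in the previous frame.
prior = [Component(0.5, 0.0, 1.0), Component(0.5, 10.0, 1.0)]

predicted = predict(prior, drift=1.0, process_var=0.5)
print("search interval:", search_interval(predicted))

# A measurement near the first hypothesis shifts almost all weight to it.
posterior = update(predicted, z=1.2, meas_var=0.5)
for c in posterior:
    print(c)
```

In this sketch the mixture keeps several hypotheses alive between frames, and the measurement step reweights them rather than committing to a single estimate. A real system would work with a multi-dimensional pose state and would need a way to limit the number of components over time.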

Author information

Authors and Affiliations

Princeton University, Princeton, NJ, USA
Juan Carlos Niebles & Li Fei-Fei
Universidad del Norte, Colombia
Juan Carlos Niebles
Mobileye Vision Technologies, Princeton, NJ, USA
Bohyung Han & Andras Ferencz

Editor information

Editors and Affiliations

Computer Science Department, University of Illinois at Urbana Champaign, 3310 Siebel Hall, IL 61801, Urbana, USA
David Forsyth
Department of Computing, Wheatley, Oxford Brookes University, OX33 1HX, Oxford, UK
Philip Torr
Department of Engineering Science, University of Oxford, Parks Road, OX1 3PJ, Oxford, UK
Andrew Zisserman

Cite this paper

Niebles, J.C., Han, B., Ferencz, A., Fei-Fei, L. (2008). Extracting Moving People from Internet Videos. In: Forsyth, D., Torr, P., Zisserman, A. (eds) Computer Vision – ECCV 2008. ECCV 2008. Lecture Notes in Computer Science, vol 5305. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-88693-8_39

DOI: https://doi.org/10.1007/978-3-540-88693-8_39
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-88692-1
Online ISBN: 978-3-540-88693-8
eBook Packages: Computer Science, Computer Science (R0)
