Object Segmentation by Long Term Analysis of Point Trajectories

  • Thomas Brox
  • Jitendra Malik
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6315)


Unsupervised learning requires a grouping step that defines which data belong together. A natural way of grouping in images is the segmentation of objects or parts of objects. While pure bottom-up segmentation from static cues is well known to be ambiguous at the object level, the story changes as soon as objects move. In this paper, we present a method that uses long-term point trajectories based on dense optical flow. Defining pairwise distances between these trajectories allows us to cluster them, which results in temporally consistent segmentations of moving objects in a video shot. In contrast to multi-body factorization, points and even whole objects may appear or disappear during the shot. We provide a benchmark dataset and an evaluation method for this previously unaddressed setting.
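The idea of clustering trajectories through pairwise distances can be illustrated with a minimal sketch. It is not the distance or the clustering procedure of the paper, which the abstract does not specify. It assumes a simplified distance (the maximum squared velocity difference over the frames two tracks share), an exponential affinity with a hypothetical scale `lam`, and a two-way split on the second eigenvector of the normalized graph Laplacian. All function names and the track layout are illustrative assumptions.

```python
import numpy as np

def make_track(start, origin, velocity, length):
    """Hypothetical track container: first frame index plus (length, 2) positions."""
    steps = np.arange(length, dtype=float)[:, None]
    xy = np.asarray(origin, dtype=float) + steps * np.asarray(velocity, dtype=float)
    return {"start": start, "xy": xy}

def trajectory_distance(a, b):
    """Assumed distance: max squared velocity difference over the shared frames.

    Returns np.inf when the tracks share fewer than two frames, because
    no motion can be compared in that case.
    """
    first = max(a["start"], b["start"])
    last = min(a["start"] + len(a["xy"]), b["start"] + len(b["xy"]))  # exclusive
    if last - first < 2:
        return np.inf
    pa = a["xy"][first - a["start"]: last - a["start"]]
    pb = b["xy"][first - b["start"]: last - b["start"]]
    va = np.diff(pa, axis=0)  # frame-to-frame displacement of track a
    vb = np.diff(pb, axis=0)  # frame-to-frame displacement of track b
    return float(np.max(np.sum((va - vb) ** 2, axis=1)))

def affinity_matrix(tracks, lam=1.0):
    """Turn distances into affinities exp(-lam * d); non-overlapping pairs get 0."""
    n = len(tracks)
    w = np.eye(n)
    for i in range(n):
        for j in range(i + 1, n):
            d = trajectory_distance(tracks[i], tracks[j])
            w[i, j] = w[j, i] = np.exp(-lam * d)
    return w

def two_way_spectral_split(w):
    """Split into two groups by the sign of the second Laplacian eigenvector."""
    deg = w.sum(axis=1)  # at least 1 because of the unit diagonal
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    lap = np.eye(len(w)) - d_inv_sqrt[:, None] * w * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(lap)  # eigenvalues come back in ascending order
    fiedler = vecs[:, 1] * d_inv_sqrt
    return (fiedler > 0).astype(int)

if __name__ == "__main__":
    tracks = [
        make_track(0, (10, 10), (0, 0), 6),   # static, visible throughout
        make_track(0, (50, 20), (0, 0), 6),   # static, visible throughout
        make_track(2, (30, 40), (0, 0), 4),   # static, appears late
        make_track(0, (5, 5), (1.5, 0), 6),   # moving, visible throughout
        make_track(1, (8, 9), (1.5, 0), 5),   # moving, appears late
        make_track(0, (6, 12), (1.5, 0), 4),  # moving, disappears early
    ]
    print(two_way_spectral_split(affinity_matrix(tracks)))
```

In this toy example the three static tracks receive one label and the three moving tracks the other, even though some tracks appear late or disappear early. Which group is labeled 0 or 1 is arbitrary, since the sign of an eigenvector is not fixed. The sketch also assumes that the affinity graph is connected; a shot with groups that never overlap in time would need a more careful clustering step.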



Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Thomas Brox (1, 2)
  • Jitendra Malik (1)
  1. University of California at Berkeley
  2. Albert-Ludwigs-University of Freiburg, Germany
