Video Pop-up: Monocular 3D Reconstruction of Dynamic Scenes

  • Chris Russell
  • Rui Yu
  • Lourdes Agapito
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8695)


Consider a video sequence captured by a single camera observing a complex dynamic scene that contains an unknown mixture of multiple moving and possibly deforming objects. In this paper we propose an unsupervised approach to the challenging problem of simultaneously segmenting the scene into its constituent objects and reconstructing a 3D model of the scene. The strength of our approach comes from its ability to deal with real-world dynamic scenes and to seamlessly handle different types of motion: rigid, articulated and non-rigid. We formulate the problem as a hierarchical graph-cut based segmentation in which we decompose the whole scene into background and foreground objects, and model the complex motion of non-rigid or articulated objects as a set of overlapping rigid parts. We evaluate the motion segmentation functionality of our approach on the Berkeley Motion Segmentation Dataset. In addition, to validate the capability of our approach to deal with real-world scenes, we provide 3D reconstructions of some challenging videos from the YouTube-Objects dataset.
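To make the idea of describing motion by rigid parts more concrete, the following is a minimal illustrative sketch and not the method of the paper. It assumes an orthographic camera (an assumption introduced here for simplicity) and uses the standard factorization fact that the centred 2D tracks of a single rigidly moving part form a matrix of rank at most 3. The function names (`project_tracks`, `rigid_residual`) and all synthetic data are hypothetical. Tracks from one rigid part are explained almost perfectly by one rank-3 model, whereas tracks pooled from two independently moving parts are not, which is the kind of cue a segmentation into rigid parts can exploit.

```python
import numpy as np

def rotation_about_y(theta):
    """3x3 rotation matrix about the y axis."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])

def rotation_about_x(theta):
    """3x3 rotation matrix about the x axis."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]])

def project_tracks(points, rotations, translations):
    """Orthographically project 3xP points in every frame.

    Returns a 2F x P measurement matrix (two rows per frame).
    """
    rows = []
    for R, t in zip(rotations, translations):
        cam = R @ points + t[:, None]
        rows.append(cam[:2])  # drop depth: orthographic camera (assumption)
    return np.vstack(rows)

def rigid_residual(W, rank=3):
    """Fraction of track energy left unexplained by one rank-`rank` model.

    Rows are centred first, which removes the per-frame translation.
    """
    centred = W - W.mean(axis=1, keepdims=True)
    s = np.linalg.svd(centred, compute_uv=False)
    return float(np.sum(s[rank:] ** 2) / np.sum(s ** 2))

# Synthetic example: two parts with different rigid motions.
rng = np.random.default_rng(0)
frames = 12
angles = np.linspace(0.0, 0.8, frames)
part_a = rng.normal(size=(3, 20))
part_b = rng.normal(size=(3, 20))

W_a = project_tracks(
    part_a,
    [rotation_about_y(a) for a in angles],
    [np.zeros(3) for _ in range(frames)],
)
W_b = project_tracks(
    part_b,
    [rotation_about_x(2.0 * a) for a in angles],
    [np.array([0.1 * i, 0.0, 0.0]) for i in range(frames)],
)

residual_single = rigid_residual(W_a)                    # one rigid part
residual_mixed = rigid_residual(np.hstack([W_a, W_b]))   # two parts pooled
print(residual_single, residual_mixed)
```

In this toy setting the residual acts as a per-group fitting cost: it is numerically zero for a single rigid part and clearly larger when trajectories from differently moving parts are forced to share one model.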


Minimum Description Length, Dynamic Scene, Bundle Adjustment, Fundamental Matrix, Motion Segmentation
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.



Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Chris Russell (1)
  • Rui Yu (1)
  • Lourdes Agapito (1)
  1. University College London, UK
