
A Unified Algebraic Approach to 2-D and 3-D Motion Segmentation and Estimation

Published in: Journal of Mathematical Imaging and Vision

Abstract

In this paper, we present an analytic solution to the problem of estimating an unknown number of 2-D and 3-D motion models from two-view point correspondences or optical flow. The key to our approach is to view the estimation of multiple motion models as the estimation of a single multibody motion model. This is possible thanks to two important algebraic facts. First, we show that all the image measurements, regardless of their associated motion model, can be fit with a single real or complex polynomial. Second, we show that the parameters of the individual motion model associated with an image measurement can be obtained from the derivatives of the polynomial at that measurement. This leads to an algebraic motion segmentation and estimation algorithm that applies to most of the two-view motion models that have been adopted in computer vision. Our experiments show that the proposed algorithm outperforms existing algebraic and factorization-based methods in terms of efficiency and robustness, and provides a good initialization for iterative techniques, such as Expectation Maximization, whose performance strongly depends on good initialization.
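The two algebraic facts above (fit one polynomial to all measurements, then recover each individual model from the polynomial's derivatives) can be illustrated with a minimal GPCA-style sketch. This is not the paper's algorithm; it is a toy analogue in which points drawn from two lines through the origin in the plane stand in for image measurements from two motion models, and all variable names are illustrative:

```python
import numpy as np

# Toy data: points from two 1-D subspaces (lines through the origin) in R^2,
# standing in for measurements generated by two different motion models.
rng = np.random.default_rng(0)
n1 = np.array([1.0, 2.0]) / np.sqrt(5.0)    # unit normal of line 1
n2 = np.array([3.0, -1.0]) / np.sqrt(10.0)  # unit normal of line 2
d1 = np.array([n1[1], -n1[0]])              # direction of line 1
d2 = np.array([n2[1], -n2[0]])              # direction of line 2
X = np.vstack([np.outer(rng.standard_normal(50), d1),
               np.outer(rng.standard_normal(50), d2)])  # (100, 2)

# Fact 1: a single degree-2 polynomial p(x, y) = c0*x^2 + c1*x*y + c2*y^2
# vanishes on ALL points regardless of which line generated them. Its
# coefficients span the null space of the Veronese (monomial) embedding.
V = np.column_stack([X[:, 0]**2, X[:, 0] * X[:, 1], X[:, 1]**2])
c = np.linalg.svd(V)[2][-1]  # right singular vector of smallest singular value

# Fact 2: the gradient of p at a point is proportional to the normal of the
# one subspace (motion model) that generated that point.
def model_normal(x, c):
    g = np.array([2 * c[0] * x[0] + c[1] * x[1],
                  c[1] * x[0] + 2 * c[2] * x[1]])
    return g / np.linalg.norm(g)

g_a = model_normal(X[0], c)   # a point from line 1
g_b = model_normal(X[50], c)  # a point from line 2
# Up to sign, g_a recovers n1 and g_b recovers n2, so each measurement is
# segmented and its model estimated without iterating or pre-clustering.
print(abs(g_a @ n1), abs(g_b @ n2))  # both close to 1
```

The same two-step structure (polynomial fitting via a null-space computation, then per-point differentiation) is what the paper generalizes to the multibody versions of the two-view motion models.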


References

  1. S. Avidan and A. Shashua, “Trajectory triangulation: 3D reconstruction of moving points from a monocular image sequence,” IEEE Trans. on Pattern Analysis and Machine Intelligence, Vol. 22, No. 4, pp. 348–357, 2000.


  2. S. Ayer and H. Sawhney, “Layered representation of motion video using robust maximum-likelihood estimation of mixture models and MDL encoding,” in IEEE International Conference on Computer Vision, 1995, pp. 777–785.

  3. M. Black and P. Anandan, “Robust dynamic motion estimation over time,” in IEEE Conference on Computer Vision and Pattern Recognition, 1991, pp. 296–302.

  4. T.E. Boult and L.G. Brown, “Factorization-based segmentation of motions,” in Proc. of the IEEE Workshop on Motion Understanding, 1991, pp. 179–186.

  5. A. Chiuso, P. Favaro, H. Jin, and S. Soatto, “Motion and structure causally integrated over time,” IEEE Trans. on Pattern Analysis and Machine Intelligence, Vol. 24, No. 4, pp. 523–535, 2002.


  6. T. Darrel and A. Pentland, “Robust estimation of a multi-layered motion representation,” in IEEE Workshop on Visual Motion, 1991, pp. 173–178.

  7. X. Feng and P. Perona, “Scene segmentation from 3D motion,” in IEEE Conference on Computer Vision and Pattern Recognition, 1998, pp. 225–231.

  8. M.A. Fischler and R.C. Bolles, “Random sample consensus: A paradigm for model fitting with applications to image analysis and automated cartography,” Communications of the ACM, Vol. 24, No. 6, pp. 381–395, 1981.


  9. A. Fitzgibbon and A. Zisserman, “Multibody structure and motion: 3D reconstruction of independently moving objects,” in European Conference on Computer Vision, 2000, pp. 891–906.

  10. M. Han and T. Kanade, “Reconstruction of a scene with multiple linearly moving objects,” in IEEE Conference on Computer Vision and Pattern Recognition, Vol. 2, 2000, pp. 542–549.

  11. M. Han and T. Kanade, “Multiple motion scene reconstruction from uncalibrated views,” in IEEE International Conference on Computer Vision, Vol. 1, pp. 163–170, 2001.

  12. J. Harris, Algebraic Geometry: A First Course, Springer-Verlag, 1992.

  13. R. Hartley and R. Vidal, “The multibody trifocal tensor: Motion segmentation from 3 perspective views,” in IEEE Conference on Computer Vision and Pattern Recognition, Vol. I, pp. 769–775, 2004.

  14. R. Hartley and A. Zisserman, Multiple View Geometry in Computer Vision, 2nd edition, Cambridge, 2004.

  15. A. Heyden and K. Åström, “Algebraic properties of multilinear constraints,” Mathematical Methods in Applied Sciences, Vol. 20, No. 13, pp. 1135–1162, 1997.


  16. M. Irani, B. Rousso, and S. Peleg, “Detecting and tracking multiple moving objects using temporal integration,” in European Conference on Computer Vision, 1992, pp. 282–287.

  17. A. Jepson and M. Black, “Mixture models for optical flow computation,” in IEEE Conference on Computer Vision and Pattern Recognition, 1993, pp. 760–761.

  18. K. Kanatani, “Motion segmentation by subspace separation and model selection,” in IEEE International Conference on Computer Vision, 2001, Vol. 2, pp. 586–591.

  19. K. Kanatani, “Evaluation and selection of models for motion segmentation,” in Asian Conference on Computer Vision, 2002, pp. 7–12.

  20. K. Kanatani and C. Matsunaga, “Estimating the number of independent motions for multibody motion segmentation,” in European Conference on Computer Vision, 2002, pp. 25–31.

  21. K. Kanatani and Y. Sugaya, “Multi-stage optimization for multi-body motion segmentation,” in Australia-Japan Advanced Workshop on Computer Vision, 2003, pp. 335–349.

  22. Q. Ke and T. Kanade, “A robust subspace approach to layer extraction,” in IEEE Workshop on Motion and Video Computing, 2002, pp. 37–43.

  23. H.C. Longuet-Higgins, “A computer algorithm for reconstructing a scene from two projections,” Nature, Vol. 293, pp. 133–135, 1981.


  24. Y. Ma, S. Soatto, J. Kosecka, and S. Sastry, An Invitation to 3D Vision: From Images to Geometric Models, Springer Verlag, 2003.

  25. O. Shakernia, R. Vidal, and S. Sastry, “Multi-body motion estimation and segmentation from multiple central panoramic views,” in IEEE International Conference on Robotics and Automation, 2003, Vol. 1, pp. 571–576.

  26. A. Shashua and A. Levin, “Multi-frame infinitesimal motion model for the reconstruction of (dynamic) scenes with multiple linearly moving objects,” in IEEE International Conference on Computer Vision, 2001, Vol. 2, pp. 592–599.

  27. J. Shi and J. Malik, “Motion segmentation and tracking using normalized cuts,” in IEEE International Conference on Computer Vision, 1998, pp. 1154–1160.

  28. A. Spoerri and S. Ullman, “The early detection of motion boundaries,” in IEEE International Conference on Computer Vision, 1987, pp. 209–218.

  29. P. Sturm, “Structure and motion for dynamic scenes - the case of points moving in planes,” in European Conference on Computer Vision, 2002, pp. 867–882.

  30. P. Torr, R. Szeliski, and P. Anandan, “An integrated Bayesian approach to layer extraction from image sequences,” IEEE Trans. on Pattern Analysis and Machine Intelligence, Vol. 23, No. 3, pp. 297–303, 2001.


  31. P.H.S. Torr, “Geometric motion segmentation and model selection,” Phil. Trans. Royal Society of London, Vol. 356, No. 1740, pp. 1321–1340, 1998.

  32. R. Vidal, “Segmentation of dynamic scenes taken by a central panoramic camera,” in Imaging Beyond the Pinhole Camera, LNCS, Springer Verlag, 2006.

  33. R. Vidal and R. Hartley, “Motion segmentation with missing data by PowerFactorization and Generalized PCA,” in IEEE Conference on Computer Vision and Pattern Recognition, 2004, Vol. 2, pp. 310–316.

  34. R. Vidal and Y. Ma, “A unified algebraic approach to 2-D and 3-D motion segmentation,” in European Conference on Computer Vision, 2004, pp. 1–15.

  35. R. Vidal, Y. Ma, and J. Piazzi, “A new GPCA algorithm for clustering subspaces by fitting, differentiating and dividing polynomials,” in IEEE Conference on Computer Vision and Pattern Recognition, 2004, Vol. 1, pp. 510–517.

  36. R. Vidal, Y. Ma, and S. Sastry, “Generalized Principal Component Analysis (GPCA),” in IEEE Conference on Computer Vision and Pattern Recognition, Vol. I, pp. 621–628, 2003.

  37. R. Vidal, Y. Ma, and S. Sastry, “Generalized Principal Component Analysis (GPCA),” IEEE Trans. on Pattern Analysis and Machine Intelligence, Vol. 27, No. 12, pp. 1–15, 2005.


  38. R. Vidal, Y. Ma, S. Soatto, and S. Sastry, “Two-view multibody structure from motion,” International Journal of Computer Vision, Vol. 68, No. 1, 2006.

  39. R. Vidal and S. Sastry, “Optimal segmentation of dynamic scenes from two perspective views,” in IEEE Conference on Computer Vision and Pattern Recognition, 2003, Vol. 2, pp. 281–286.

  40. R. Vidal, S. Soatto, Y. Ma, and S. Sastry, “Segmentation of dynamic scenes from the multibody fundamental matrix,” in ECCV Workshop on Visual Modeling of Dynamic Scenes, 2002.

  41. J. Wang and E. Adelson, “Layered representation for motion analysis,” in IEEE Conference on Computer Vision and Pattern Recognition, 1993, pp. 361–366.

  42. Y. Weiss, “A unified mixture framework for motion segmentation: Incorporating spatial coherence and estimating the number of models,” in IEEE Conference on Computer Vision and Pattern Recognition, 1996, pp. 321–326.

  43. Y. Weiss, “Smoothness in layers: Motion segmentation using nonparametric mixture estimation,” in IEEE Conference on Computer Vision and Pattern Recognition, 1997, pp. 520–526.

  44. L. Wolf and A. Shashua, “Affine 3-D reconstruction from two projective images of independently translating planes,” in IEEE International Conference on Computer Vision, 2001, pp. 238–244.

  45. L. Wolf and A. Shashua, “Two-body segmentation from two perspective views,” in IEEE Conference on Computer Vision and Pattern Recognition, 2001, pp. 263–270.

  46. Y. Wu, Z. Zhang, T.S. Huang, and J.Y. Lin, “Multibody grouping via orthogonal subspace decomposition,” in IEEE Conference on Computer Vision and Pattern Recognition, 2001, Vol. 2, pp. 252–257.

  47. L. Zelnik-Manor and M. Irani, “Degeneracies, dependencies and their implications in multi-body and multi-sequence factorization,” in IEEE Conference on Computer Vision and Pattern Recognition, 2003, Vol. 2, pp. 287–293.


Author information

Corresponding author

Correspondence to René Vidal.

Additional information

This paper is an extended version of [34]. The authors thank Sampreet Niyogi for his help with the experimental section of the paper. This work was partially supported by Hopkins WSE startup funds, UIUC ECE startup funds, and by grants NSF CAREER IIS-0347456, NSF CAREER IIS-0447739, NSF CRS-EHS-0509151, NSF-EHS-0509101, NSF CCF-TF-0514955, ONR YIP N00014-05-1-0633 and ONR N00014-05-1-0836.

René Vidal received his B.S. degree in Electrical Engineering (highest honors) from the Universidad Católica de Chile in 1997 and his M.S. and Ph.D. degrees in Electrical Engineering and Computer Sciences from the University of California at Berkeley in 2000 and 2003, respectively. In 2004, he joined The Johns Hopkins University as an Assistant Professor in the Department of Biomedical Engineering and the Center for Imaging Science. He has co-authored more than 70 articles in biomedical imaging, computer vision, machine learning, hybrid systems, robotics, and vision-based control. Dr. Vidal is the recipient of the 2005 NSF CAREER Award, the 2004 Best Paper Award Honorable Mention at the European Conference on Computer Vision, the 2004 Sakrison Memorial Prize, the 2003 Eli Jury Award, and the 1997 Award of the School of Engineering of the Universidad Católica de Chile to the best graduating student of the school.

Yi Ma received two bachelor's degrees, in Automation and in Applied Mathematics, from Tsinghua University, Beijing, China in 1995. He received an M.S. degree in Electrical Engineering and Computer Science (EECS) in 1997, an M.A. degree in Mathematics in 2000, and a Ph.D. degree in EECS in 2000, all from the University of California at Berkeley. Since August 2000, he has been on the faculty of the Electrical and Computer Engineering Department of the University of Illinois at Urbana-Champaign, where he is now an associate professor. In fall 2006, he is visiting faculty at Microsoft Research Asia, Beijing, China. He has written more than 40 technical papers and is the first author of the book “An Invitation to 3-D Vision: From Images to Geometric Models,” published by Springer in 2003. Yi Ma was the recipient of the David Marr Best Paper Prize at the International Conference on Computer Vision in 1999 and Honorable Mention for the Longuet-Higgins Best Paper Award at the European Conference on Computer Vision in 2004. He received the CAREER Award from the National Science Foundation in 2004 and the Young Investigator Program Award from the Office of Naval Research in 2005.


Cite this article

Vidal, R., Ma, Y. A Unified Algebraic Approach to 2-D and 3-D Motion Segmentation and Estimation. J Math Imaging Vis 25, 403–421 (2006). https://doi.org/10.1007/s10851-006-8286-z
