A Dynamic Programming Approach to Maximizing Tracks for Structure from Motion

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 5995)

Abstract

We present a novel algorithm for improving the accuracy of structure from motion on video sequences. Its goal is to efficiently recover scene structure and camera pose by using dynamic programming to maximize the lengths of putative keypoint tracks. By efficiently discarding poor correspondences while maintaining the largest possible set of inliers, it ultimately provides a robust and accurate scene reconstruction. Traditional outlier detection strategies, such as RANSAC and its derivatives, cannot handle high-dimensional problems such as structure from motion over long image sequences. We prove that, given an estimate of the camera pose at a given frame, the outlier detection is optimal and runs in low-order polynomial time. The algorithm is applied on-line, processing each frame in sequential order. Results are presented on several indoor and outdoor video sequences, each processed both with and without the proposed optimization. The improvement in average reprojection error demonstrates the effectiveness of the optimization.
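The abstract evaluates the method by average reprojection error and classifies correspondences as inliers or outliers given a camera pose estimate. The following is a minimal sketch of that standard quantity only, not the paper's dynamic programming algorithm. It assumes a pinhole camera with intrinsics `K`, rotation `R`, and translation `t`; the function names and the pixel threshold `threshold_px` are hypothetical and chosen for illustration.

```python
import math

def project(K, R, t, X):
    """Project a 3D point X with a pinhole camera K [R | t] (assumed model)."""
    # Camera-frame coordinates: Xc = R * X + t
    Xc = [sum(R[i][j] * X[j] for j in range(3)) + t[i] for i in range(3)]
    if Xc[2] <= 0:
        return None  # point lies behind the camera
    # Apply the intrinsics, then divide by depth
    u = (K[0][0] * Xc[0] + K[0][1] * Xc[1] + K[0][2] * Xc[2]) / Xc[2]
    v = (K[1][1] * Xc[1] + K[1][2] * Xc[2]) / Xc[2]
    return (u, v)

def reprojection_error(K, R, t, X, observed):
    """Euclidean pixel distance between the projection of X and its observation."""
    p = project(K, R, t, X)
    if p is None:
        return math.inf
    return math.hypot(p[0] - observed[0], p[1] - observed[1])

def split_inliers(K, R, t, points, observations, threshold_px=2.0):
    """Split correspondences by a hypothetical pixel threshold on the error."""
    inliers, outliers = [], []
    for idx, (X, obs) in enumerate(zip(points, observations)):
        err = reprojection_error(K, R, t, X, obs)
        (inliers if err <= threshold_px else outliers).append(idx)
    return inliers, outliers

# Illustrative values only: identity pose, one good and one bad observation.
K = [[500.0, 0.0, 320.0], [0.0, 500.0, 240.0], [0.0, 0.0, 1.0]]
R = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
t = [0.0, 0.0, 0.0]
points = [(0.0, 0.0, 5.0), (1.0, 0.0, 5.0)]
observations = [(320.5, 240.0), (450.0, 240.0)]
inliers, outliers = split_inliers(K, R, t, points, observations)
```

A fixed threshold such as this treats each observation independently; the paper's contribution, per the abstract, is instead to select inliers so that track lengths are maximized, which this sketch does not attempt.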

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Mooser, J., You, S., Neumann, U., Grasset, R., Billinghurst, M. (2010). A Dynamic Programming Approach to Maximizing Tracks for Structure from Motion. In: Zha, H., Taniguchi, Ri., Maybank, S. (eds) Computer Vision – ACCV 2009. ACCV 2009. Lecture Notes in Computer Science, vol 5995. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-12304-7_1

  • DOI: https://doi.org/10.1007/978-3-642-12304-7_1

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-12303-0

  • Online ISBN: 978-3-642-12304-7

  • eBook Packages: Computer Science, Computer Science (R0)
