Efficient Structure from Motion by Graph Optimization

  • Michal Havlena
  • Akihiko Torii
  • Tomáš Pajdla
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6312)


We present an efficient structure from motion algorithm that can process large image collections in a fraction of the time and effort required by previous approaches while providing comparable quality of scene and camera reconstruction. First, we employ fast image indexing using large image vocabularies to measure the visual overlap of images without running actual image matching. Then, we select a small subset of the input images by computing an approximate minimal connected dominating set of the image set with a fast polynomial algorithm. Finally, we use task prioritization to avoid spending too much time on a few difficult matching problems instead of exploring other, easier options. Thus we avoid wasting time on image pairs with a low chance of success and avoid matching highly redundant images of landmarks. We present results for several challenging sets of thousands of perspective as well as omnidirectional images.
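
The image-selection step can be illustrated with a small sketch. The abstract does not spell out the algorithm, so the code below is only one plausible reading: a simple greedy heuristic that grows a connected dominating set from the best-connected image. The names `greedy_connected_dominating_set` and `overlap` are hypothetical, and the toy graph assumes that an edge joins two images whenever their vocabulary-based similarity exceeds some threshold.

```python
from typing import Dict, Set

# Hypothetical representation: image id -> ids of visually overlapping images.
Graph = Dict[int, Set[int]]


def greedy_connected_dominating_set(graph: Graph) -> Set[int]:
    """Greedily grow a connected dominating set of a connected undirected graph.

    This is an illustrative heuristic, not necessarily the algorithm used in
    the paper. Ties are broken in favour of the smallest image id.
    """
    if not graph:
        return set()
    # Seed with the image that overlaps the most other images.
    start = min(graph, key=lambda v: (-len(graph[v]), v))
    selected = {start}
    dominated = {start} | graph[start]
    while len(dominated) < len(graph):
        # Dominated but unselected images are adjacent to the selected set,
        # so adding any of them keeps the selection connected.
        frontier = dominated - selected
        best = min(
            frontier,
            key=lambda v: (-len(graph[v] - dominated), v),
            default=None,
        )
        if best is None or not (graph[best] - dominated):
            raise ValueError("overlap graph is not connected")
        selected.add(best)
        dominated |= graph[best]
    return selected


# Toy overlap graph: a chain 0-1-2-3-4 plus two redundant views (5, 6) of
# whatever image 2 shows.
overlap: Graph = {
    0: {1},
    1: {0, 2},
    2: {1, 3, 5, 6},
    3: {2, 4},
    4: {3},
    5: {2, 6},
    6: {2, 5},
}

skeleton = greedy_connected_dominating_set(overlap)
print(sorted(skeleton))  # prints [1, 2, 3]
```

In this toy case the redundant views 5 and 6 are left out of the selection, yet every image is either selected or overlaps a selected one, which is the property that lets the remaining images be attached after the small subset has been reconstructed.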






Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Michal Havlena (1)
  • Akihiko Torii (1, 2)
  • Tomáš Pajdla (1)

  1. Center for Machine Perception, Department of Cybernetics, Faculty of Elec. Eng., Czech Technical University in Prague, Prague 6, Czech Republic
  2. Tokyo Institute of Technology, Tokyo, Japan
