International Journal of Computer Vision, Volume 124, Issue 1, pp 2–17

A TV Prior for High-Quality Scalable Multi-View Stereo Reconstruction

  • Andreas Kuhn
  • Heiko Hirschmüller
  • Daniel Scharstein
  • Helmut Mayer


We present a scalable multi-view stereo method able to reconstruct accurate 3D models from hundreds of high-resolution input images. Local fusion of disparity maps obtained with semi-global matching enables the reconstruction of large scenes that do not fit into main memory. Since disparity maps may vary widely in quality and resolution, careful modeling of the 3D errors is crucial. We derive a sound stereo error model based on disparity uncertainty, which can vary spatially from tenths to several pixels. We introduce a feature based on total variation that allows pixel-wise classification of disparities into different error classes. For each class, we learn a disparity error distribution from ground-truth data using expectation maximization. We present a novel method for stochastic fusion of data with varying quality by adapting a multi-resolution volumetric fusion process that uses our error classes as a prior and models surface probabilities via an octree of voxels. Conflicts during surface extraction are resolved using visibility constraints and preference for voxels at higher resolutions. Experimental results on several challenging large-scale datasets demonstrate that our method yields improved performance both qualitatively and quantitatively.
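As a rough illustration of two ideas named in the abstract, the sketch below propagates a disparity uncertainty into a depth uncertainty for a rectified stereo pair, and computes a simple local total-variation value of a disparity map. It is a minimal sketch under stated assumptions, not the authors' formulation: the abstract does not specify the exact error model or TV feature, so the function names (`depth_sigma`, `local_tv`), the first-order propagation, and the window-averaged absolute differences are all illustrative choices.

```python
from typing import List


def depth_from_disparity(d: float, focal_px: float, baseline: float) -> float:
    """Depth of a point in a rectified stereo pair: z = f * b / d."""
    return focal_px * baseline / d


def depth_sigma(d: float, sigma_d: float, focal_px: float, baseline: float) -> float:
    """First-order error propagation (assumed model, not taken from the paper):
    sigma_z = |dz/dd| * sigma_d = (f * b / d^2) * sigma_d."""
    return focal_px * baseline / (d * d) * sigma_d


def local_tv(disp: List[List[float]], r: int, c: int, radius: int = 1) -> float:
    """Mean absolute forward difference of disparities in a square window
    centred on (r, c). A hypothetical stand-in for a TV-based feature."""
    rows, cols = len(disp), len(disp[0])
    total, count = 0.0, 0
    for i in range(max(0, r - radius), min(rows, r + radius + 1)):
        for j in range(max(0, c - radius), min(cols, c + radius + 1)):
            if j + 1 < cols:  # horizontal neighbour
                total += abs(disp[i][j + 1] - disp[i][j])
                count += 1
            if i + 1 < rows:  # vertical neighbour
                total += abs(disp[i + 1][j] - disp[i][j])
                count += 1
    return total / count if count else 0.0


if __name__ == "__main__":
    # Hypothetical camera: 1000 px focal length, 0.5 m baseline.
    z = depth_from_disparity(100.0, 1000.0, 0.5)
    s = depth_sigma(100.0, 0.5, 1000.0, 0.5)
    print(f"depth {z:.3f} m, sigma {s:.4f} m")

    flat = [[10.0] * 3 for _ in range(3)]
    noisy = [[10.0, 12.0, 10.0], [11.0, 10.0, 13.0], [10.0, 12.0, 10.0]]
    print(local_tv(flat, 1, 1), local_tv(noisy, 1, 1))
```

Because the depth uncertainty grows with the inverse square of the disparity, the same disparity error matters far more for distant points. A window with low local variation would plausibly indicate a smoother, more reliable disparity estimate than one with high variation, which is the intuition behind binning pixels into error classes.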


Multi-View Stereo 3D Modeling Scalable 3D Surface Reconstruction 




Copyright information

© Springer Science+Business Media New York 2016

Authors and Affiliations

  1. Bundeswehr University Munich, Neubiberg, Germany
  2. German Aerospace Centre, Munich, Germany
  3. Roboception GmbH, Munich, Germany
  4. Middlebury College, Middlebury, USA
