Stereo Matching—State-of-the-Art and Research Challenges

Part of the Advances in Computer Vision and Pattern Recognition book series (ACVPR)


Stereo matching denotes the problem of finding dense correspondences in pairs of images in order to perform 3D reconstruction. In this chapter, we provide a review of stereo methods with a focus on recent developments and our own work. We start with a discussion of local methods and introduce our algorithms: geodesic stereo, cost filtering and PatchMatch stereo. Although local algorithms have recently become very popular, they cannot handle large untextured regions, where a global smoothness prior is required. In the discussion of such global methods, we briefly describe standard optimization techniques. However, the real problem lies not in the optimization, but in finding an energy function that represents a good model of the stereo problem. In this context, we investigate the data and smoothness terms of standard energies to find the best-suited implementation of each. We then describe our own work on finding a good model. This includes our combined stereo and matting approach, Surface Stereo, Object Stereo, and a new method that incorporates physics-based reasoning in stereo matching.
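To make the notion of an energy with data and smoothness terms concrete, the following is a minimal sketch for a single rectified scanline. It is not taken from the chapter: the function name `stereo_energy`, the parameters `lam`, `trunc` and `occlusion_cost`, the absolute-difference data term and the truncated linear smoothness term are all illustrative assumptions, chosen because they are common in the literature.

```python
def stereo_energy(left, right, disparity, lam=1.0, trunc=2, occlusion_cost=10.0):
    """Energy of a disparity labeling on one rectified scanline (illustrative).

    Data term: absolute intensity difference between left[x] and right[x - d].
    Smoothness term: truncated linear penalty on neighboring disparities.
    All names and default values here are assumptions, not the chapter's.
    """
    data = 0.0
    for x, d in enumerate(disparity):
        xr = x - d  # matching position in the right scanline
        if 0 <= xr < len(right):
            data += abs(left[x] - right[xr])
        else:
            data += occlusion_cost  # match falls outside the right image

    smooth = 0.0
    for d_p, d_q in zip(disparity, disparity[1:]):
        smooth += min(abs(d_p - d_q), trunc)  # cap the penalty at depth edges

    return data + lam * smooth


# Toy scanlines: the left line is the right line shifted by one pixel.
left = [10, 10, 50, 90, 10, 10]
right = [10, 50, 90, 10, 10, 10]

energy_shifted = stereo_energy(left, right, [1] * 6)  # correct disparity
energy_zero = stereo_energy(left, right, [0] * 6)     # wrong disparity
```

In this toy case the constant disparity of one pixel yields a much lower energy than a disparity of zero. A global method searches for the labeling that minimizes such an energy, whereas a local method decides each pixel from a support window without an explicit smoothness term.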


Keywords: Stereo Match, Global Method, View Synthesis, Smoothness Term, Disparity Information



This work was supported in part by the Vienna Science and Technology Fund (WWTF) under project ICT08-019.



Copyright information

© Springer-Verlag London 2013

Authors and Affiliations

  1. Vienna University of Technology, Vienna, Austria
  2. Microsoft Redmond, Redmond, USA
