Multimedia Tools and Applications, Volume 77, Issue 24, pp 31647–31663

Adaptive disparity computation using local and non-local cost aggregations

  • Qicong Dong
  • Jieqing Feng


A new method is proposed to adaptively compute the disparity in stereo matching by choosing between the candidate disparities offered by a local and a non-local disparity map. The two initial disparity maps are obtained from state-of-the-art local and non-local stereo algorithms, and the more reasonable disparity is then selected. We propose two selection strategies. The first is based on the magnitude of the gradient in the left image and is simple and fast. The second uses the fusion move to combine the two proposal labelings (disparity maps) in a theoretically sound manner and is more accurate. Finally, we propose a texture-based sub-pixel refinement of the resulting disparity map. Experimental results on the Middlebury datasets demonstrate that both selection strategies outperform the individual local and non-local algorithms. Moreover, the proposed method is compatible with many local and non-local algorithms that are widely used in stereo matching.
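The gradient-based selection strategy can be illustrated with a minimal sketch. The abstract does not state the selection rule or any threshold, so the code below rests on assumptions: pixels whose gradient magnitude in the left image exceeds a hypothetical `threshold` take the local disparity, and all other pixels take the non-local one. The function names `gradient_magnitude` and `select_disparity` are illustrative and are not taken from the paper.

```python
import numpy as np


def gradient_magnitude(gray):
    """Gradient magnitude of a 2-D grayscale image (finite differences)."""
    gy, gx = np.gradient(gray.astype(np.float64))
    return np.hypot(gx, gy)


def select_disparity(left_gray, disp_local, disp_nonlocal, threshold=10.0):
    """Choose, per pixel, between two candidate disparity maps.

    Assumption (not stated in the abstract): high-gradient pixels use the
    local disparity and low-gradient pixels use the non-local disparity.
    `threshold` is a hypothetical parameter chosen for illustration.
    """
    grad = gradient_magnitude(left_gray)
    use_local = grad > threshold
    return np.where(use_local, disp_local, disp_nonlocal)


if __name__ == "__main__":
    # Toy left image: flat on the left, a vertical intensity edge in the middle.
    left = np.zeros((4, 6))
    left[:, 3:] = 100.0
    d_local = np.full((4, 6), 5.0)     # stand-in for a local disparity map
    d_nonlocal = np.full((4, 6), 9.0)  # stand-in for a non-local disparity map
    print(select_disparity(left, d_local, d_nonlocal))
```

In this toy input only the two columns next to the intensity edge have a gradient magnitude above the threshold, so only they take the local value. The fusion-move strategy would replace this per-pixel threshold with a binary labeling problem over the same two proposals.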


Keywords: Stereo matching; Adaptive disparity computation; Fusion move; Disparity selection; Texture-based sub-pixel refinement



The authors would like to thank Qing Ran for her instructive discussion of this paper. This work was supported by the National Natural Science Foundation of China under Grant Nos. 61732015 and 61472349.



Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  1. State Key Lab of CAD&CG, Zhejiang University, Hangzhou, China
