Oriented-linear-tree based cost aggregation for stereo matching

  • Wenhuan Wu
  • Hong Zhu
  • Qian Zhang
Article

Abstract

Matching cost aggregation is one of the most important steps in dense stereo correspondence, and non-local cost aggregation methods based on tree structures have been widely studied recently. In this paper, we analyze the shortcomings of both local window-based and non-local tree-based aggregation methods, and propose a novel oriented linear tree structure for each pixel to perform non-local cost aggregation. Firstly, each pixel in the image has an oriented linear tree rooted at it, and each oriented linear tree consists of multiple 1D paths from different directions. Unlike other spanning trees, our oriented linear trees do not need to be constructed beforehand, since they are naturally embedded in the original image. Moreover, each root pixel receives support not only from adjacent pixels within its local support window, but also from the other pixels along all of its 1D paths. Secondly, the aggregated costs of all pixels lying on the same 1D path can be computed simultaneously by traversing the path back and forth twice. Finally, the aggregated cost for each root pixel is obtained by summing the aggregated costs from all of its 1D paths. Performance evaluation on the Middlebury and KITTI datasets shows that the proposed method outperforms current state-of-the-art aggregation methods.
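
The per-path step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the recursion or the weighting function, so the code below makes its own assumptions: it uses one forward and one backward sweep (the standard two-pass scheme for a chain), and an assumed edge similarity exp(-|I[i] - I[i+1]| / sigma). The names `aggregate_along_path`, `aggregate_brute_force`, and `sigma` are hypothetical and are not taken from the paper.

```python
import math


def aggregate_along_path(costs, intensities, sigma=0.1):
    """Aggregate matching costs along one 1D path with two sweeps.

    costs[i] is the raw matching cost of pixel i at one fixed disparity;
    intensities[i] is its gray value. The edge weight
    exp(-|I[i] - I[i+1]| / sigma) is an assumption for illustration.
    """
    n = len(costs)
    if n == 0:
        return []
    # Similarity between each pair of neighbouring pixels on the path.
    w = [math.exp(-abs(intensities[i] - intensities[i + 1]) / sigma)
         for i in range(n - 1)]

    # Forward sweep: each pixel collects support from the pixels before it.
    fwd = list(costs)
    for i in range(1, n):
        fwd[i] = costs[i] + w[i - 1] * fwd[i - 1]

    # Backward sweep: each pixel collects support from the pixels after it.
    bwd = list(costs)
    for i in range(n - 2, -1, -1):
        bwd[i] = costs[i] + w[i] * bwd[i + 1]

    # costs[i] appears in both sweeps, so subtract it once.
    return [fwd[i] + bwd[i] - costs[i] for i in range(n)]


def aggregate_brute_force(costs, intensities, sigma=0.1):
    """O(n^2) reference: the weight between i and j is the product of the
    edge weights on the path segment that joins them."""
    n = len(costs)
    out = []
    for i in range(n):
        total = 0.0
        for j in range(n):
            lo, hi = min(i, j), max(i, j)
            dist = sum(abs(intensities[k] - intensities[k + 1])
                       for k in range(lo, hi))
            total += math.exp(-dist / sigma) * costs[j]
        out.append(total)
    return out


if __name__ == "__main__":
    costs = [0.3, 1.2, 0.7, 2.5, 0.1]
    gray = [0.10, 0.15, 0.90, 0.85, 0.20]
    print(aggregate_along_path(costs, gray, sigma=0.2))
    print(aggregate_brute_force(costs, gray, sigma=0.2))
```

Under these assumptions the two sweeps give every pixel on the path its aggregated cost in linear time, and the brute-force function serves only as a check that the result is the same. Summing such per-path results over the paths through a root pixel would correspond to the final step described above.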

Keywords

  • Stereo matching
  • Cost aggregation
  • Oriented linear tree
  • Cost volume
  • Edge-aware filtering

Notes

Acknowledgements

This work was supported by the National Natural Science Foundation of China (Nos. 61673318, 61703301, 61771386, 61801005) and by the Research Project of the Hubei Provincial Department of Education, China (B2017080).

Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  1. School of Automation and Information Engineering, Xi’an University of Technology, Xi’an, China
  2. School of Electrical and Information Engineering, Hubei University of Automotive Technology, Shiyan, China
  3. Department of Information Science and Technology, Taishan University, Tai’an, China
