Detecting ground control points via convolutional neural network for stereo matching

Zhong, Zhun; Su, Songzhi; Cao, Donglin; Li, Shaozi; Lv, Zhihan

doi:10.1007/s11042-016-3932-y

Published: 22 September 2016

Volume 76, pages 18473–18488 (2017)

Multimedia Tools and Applications

Zhun Zhong¹, Songzhi Su¹, Donglin Cao¹, Shaozi Li¹ & Zhihan Lv²

890 Accesses
4 Citations

Abstract

In this paper, we present a novel approach to detecting ground control points (GCPs) for the stereo matching problem. First, we train a convolutional neural network (CNN) on a large stereo dataset and use the trained model to compute the matching confidence of each pixel. Second, we present a GCP selection scheme based on the maximum matching confidence of each pixel. Finally, we use the selected GCPs to refine the matching costs and then optimize the refined costs with the semi-global matching algorithm to improve the final disparity maps. We evaluate our approach on the KITTI 2012 stereo benchmark dataset. Our experiments show that the proposed approach significantly improves the accuracy of disparity maps.
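
The selection and refinement steps named in the abstract can be illustrated with a minimal sketch. The abstract does not state the selection rule or the refinement formula, so the details below are assumptions made for illustration only: a pixel is treated as a GCP when its maximum confidence over all candidate disparities reaches a fixed `threshold`, and refinement sets the cost at the selected disparity to zero and all other disparities at that pixel to a constant `penalty`. The function names `select_gcps` and `refine_costs`, the parameter values, and the `(H, W, D)` array layout are hypothetical and are not taken from the paper.

```python
import numpy as np


def select_gcps(confidence, threshold=0.9):
    """Pick candidate GCPs from a confidence volume of shape (H, W, D).

    Assumed rule (not from the paper): a pixel is a GCP when its
    maximum matching confidence over all D disparities is at least
    `threshold`. Returns a boolean mask and the arg-max disparity map.
    """
    best_disp = confidence.argmax(axis=2)   # most confident disparity per pixel
    best_conf = confidence.max(axis=2)      # its confidence value
    is_gcp = best_conf >= threshold
    return is_gcp, best_disp


def refine_costs(cost, is_gcp, gcp_disp, penalty=1.0):
    """Return a copy of the cost volume with GCP pixels pinned.

    Assumed refinement (not from the paper): at each GCP pixel the cost
    is 0 at the selected disparity and `penalty` everywhere else.
    Non-GCP pixels keep their original costs.
    """
    refined = cost.copy()
    rows, cols = np.nonzero(is_gcp)
    refined[rows, cols, :] = penalty
    refined[rows, cols, gcp_disp[rows, cols]] = 0.0
    return refined


# Toy example: a 4x5 image with 8 candidate disparities.
rng = np.random.default_rng(0)
confidence = rng.uniform(0.0, 0.5, size=(4, 5, 8))
confidence[1, 2, 3] = 0.95                  # one clearly confident match
cost = 1.0 - confidence                     # stand-in matching cost

is_gcp, gcp_disp = select_gcps(confidence, threshold=0.9)
refined = refine_costs(cost, is_gcp, gcp_disp)
```

In this toy setup only the pixel at row 1, column 2 passes the threshold, so only its cost vector changes. The refined volume would then be passed to an optimizer such as semi-global matching, which propagates the pinned disparities to neighbouring pixels through its smoothness terms.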

Acknowledgments

We thank Wenjing Li for helpful discussions and encouragement. This work is supported by the Natural Science Foundation of China (No. 61202143, No. 61572409), the Natural Science Foundation of Fujian Province (No. 2013J05100), and the Fujian Province 2011 Collaborative Innovation Center of TCM Health Management.

Author information

Authors and Affiliations

Cognitive Science Department, Xiamen University, Xiamen, 361005, Fujian, China
Zhun Zhong, Songzhi Su, Donglin Cao & Shaozi Li
SIAT, Chinese Academy of Science, Shenzhen, 518055, China
Zhihan Lv

Corresponding author

Correspondence to Shaozi Li.

About this article

Cite this article

Zhong, Z., Su, S., Cao, D. et al. Detecting ground control points via convolutional neural network for stereo matching. Multimed Tools Appl 76, 18473–18488 (2017). https://doi.org/10.1007/s11042-016-3932-y

Received: 21 May 2016
Revised: 26 July 2016
Accepted: 01 September 2016
Published: 22 September 2016
Issue Date: September 2017
DOI: https://doi.org/10.1007/s11042-016-3932-y
