Abstract
In this paper, we present a novel approach to detecting ground control points (GCPs) for the stereo matching problem. First, we train a convolutional neural network (CNN) on a large stereo dataset and use the trained model to compute the matching confidence of each pixel. Second, we present a GCP selection scheme based on the maximum matching confidence of each pixel. Finally, we use the selected GCPs to refine the matching costs and apply the refined costs in a semi-global matching optimization to improve the final disparity maps. We evaluate our approach on the KITTI 2012 stereo benchmark dataset. Our experiments show that the proposed approach significantly improves the accuracy of disparity maps.
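The selection and refinement steps described in the abstract can be illustrated with a minimal sketch. The abstract does not give the actual selection rule or refinement formula, so everything here is an assumption made for illustration: the `confidence[y][x][d]` layout, the fixed `threshold`, the `1 - confidence` cost, and the way costs are overwritten at GCPs. The function names `select_gcps` and `refine_costs` are hypothetical, and the CNN and the semi-global matching stage are not shown.

```python
# Illustrative sketch only; not the paper's implementation.
# confidence[y][x][d] is assumed to hold a CNN matching confidence in [0, 1]
# for pixel (y, x) at candidate disparity d (hypothetical layout).

def select_gcps(confidence, threshold=0.9):
    """Return {(y, x): d} for pixels whose maximum confidence reaches threshold.

    The fixed threshold is an assumed selection rule.
    """
    gcps = {}
    for y, row in enumerate(confidence):
        for x, scores in enumerate(row):
            # Disparity with the maximum matching confidence at this pixel.
            best_d = max(range(len(scores)), key=scores.__getitem__)
            if scores[best_d] >= threshold:
                gcps[(y, x)] = best_d
    return gcps


def refine_costs(confidence, gcps, penalty=1.0):
    """Turn confidences into costs and pin the costs at GCP pixels.

    Assumed scheme: cost = 1 - confidence everywhere; at a GCP the selected
    disparity gets cost 0 and every other disparity gets `penalty`.
    """
    costs = [[[1.0 - s for s in scores] for scores in row] for row in confidence]
    for (y, x), d in gcps.items():
        n = len(costs[y][x])
        costs[y][x] = [0.0 if k == d else penalty for k in range(n)]
    return costs


# Tiny example: one image row, two pixels, three candidate disparities.
confidence = [[[0.10, 0.95, 0.20],   # confident pixel: becomes a GCP
               [0.40, 0.50, 0.45]]]  # ambiguous pixel: left unchanged
gcps = select_gcps(confidence)
costs = refine_costs(confidence, gcps)
```

In a full pipeline, the refined cost volume would then be passed to a semi-global matching optimizer, so that the pinned GCP costs constrain the disparities of neighboring, less confident pixels.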
Acknowledgments
We thank Wenjing Li for helpful discussions and encouragement. This work is supported by the Nature Science Foundation of China (No. 61202143, No. 61572409), the Natural Science Foundation of Fujian Province (No. 2013J05100) and Fujian Province 2011 Collaborative Innovation Center of TCM Health Management.
Cite this article
Zhong, Z., Su, S., Cao, D. et al. Detecting ground control points via convolutional neural network for stereo matching. Multimed Tools Appl 76, 18473–18488 (2017). https://doi.org/10.1007/s11042-016-3932-y
DOI: https://doi.org/10.1007/s11042-016-3932-y