
Three-dimensional rapid registration and reconstruction of multi-view rigid objects based on end-to-end deep surface model


Three-dimensional object reconstruction from multi-view images is an important topic in computer vision that has attracted enormous attention over the past decades. With further study of deep learning, remarkable progress in three-dimensional object reconstruction has been made in recent years. In this paper, we propose a method for rapid three-dimensional registration and reconstruction of multi-view rigid objects based on an end-to-end deep surface model. Firstly, we introduce a local stereo matching algorithm based on an improved census transform and multi-scale space, aiming to improve the matching results for those regions. In the cost aggregation step, a guided map filtering algorithm with excellent gradient-preserving properties is introduced into a Gaussian pyramid structure, and regularization is added to strengthen cost volume consistency. Secondly, an improved Inception-ResNet module is added to improve the feature extraction ability of the network; multiple features are extracted using multiple network structures and are then sequentially input into the VRNN module to enhance the reconstruction of multi-view images. The experimental results show that the proposed method not only achieves better reconstruction results, but also reconstructs more details and spends less time in training.
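The abstract does not specify how the census transform is improved, so the following is only a minimal sketch of the classic census transform with a Hamming-distance matching cost, the baseline that local stereo matching methods of this kind usually start from. The function names (`census_transform`, `hamming`, `cost_volume`, `winner_takes_all`), the 3x3 window, and the border clamping are illustrative assumptions, not the authors' implementation. The cost aggregation step (guided filtering over a Gaussian pyramid) is not shown.

```python
def census_transform(img, radius=1):
    """Classic census transform (not the paper's improved variant).

    Each pixel gets a bit string with one bit per neighbour in the window;
    the bit is 1 when the neighbour is darker than the centre pixel.
    Coordinates outside the image are clamped to the nearest border pixel.
    """
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            center = img[y][x]
            bits = 0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    if dy == 0 and dx == 0:
                        continue
                    ny = min(max(y + dy, 0), h - 1)
                    nx = min(max(x + dx, 0), w - 1)
                    bits = (bits << 1) | (1 if img[ny][nx] < center else 0)
            out[y][x] = bits
    return out


def hamming(a, b):
    """Number of differing bits between two census codes."""
    return bin(a ^ b).count("1")


def cost_volume(left, right, max_disp, radius=1):
    """cost[d][y][x]: Hamming distance between the census code of
    left pixel (x, y) and right pixel (x - d, y)."""
    cl = census_transform(left, radius)
    cr = census_transform(right, radius)
    h, w = len(left), len(left[0])
    n_bits = (2 * radius + 1) ** 2 - 1
    volume = []
    for d in range(max_disp + 1):
        # Pixels whose match would fall outside the right image keep the worst cost.
        plane = [[n_bits] * w for _ in range(h)]
        for y in range(h):
            for x in range(d, w):
                plane[y][x] = hamming(cl[y][x], cr[y][x - d])
        volume.append(plane)
    return volume


def winner_takes_all(volume):
    """Pick, per pixel, the disparity with the lowest (unaggregated) cost."""
    h, w = len(volume[0]), len(volume[0][0])
    return [
        [min(range(len(volume)), key=lambda d: volume[d][y][x]) for x in range(w)]
        for y in range(h)
    ]
```

Because the census code depends only on the ordering of intensities within the window, the raw cost is tolerant of brightness offsets between views; in a full pipeline the cost volume would be aggregated before the winner-takes-all step rather than used directly as here.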






This work was financially supported by the Major Project of Philosophy and Social Science Research in Colleges and Universities of Jiangsu Province (2018SJZDA015); the Research Foundation Project of Nanjing Institute of Technology (YKJ201619); and the 2019 Major Project of Philosophy and Social Science Research in Colleges and Universities of Jiangsu Province (2019SJZDA118).

Author information

The authors contributed equally to this research, and the paper was initiated by the first author. All authors read and approved the final manuscript.

Correspondence to Lijun Xu.

Ethics declarations

Conflict of interest

We declare that there are no competing interests.

Availability of data and materials

Data will not be shared as the authors do not have permission to share data from the study.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Yan, S., Xu, L. & Wang, S. Three-dimensional rapid registration and reconstruction of multi-view rigid objects based on end-to-end deep surface model. J Supercomput (2020).



  • Three-dimensional reconstruction
  • Deep surface model
  • Multi-view
  • Rigid objects
  • Local stereo matching
  • Inception-ResNet