Efficient and Robust 3D Object Reconstruction Based on Monocular SLAM and CNN Semantic Segmentation

  • Thomas Weber
  • Sergey Triputen
  • Atmaraaj Gopal
  • Steffen Eißler
  • Christian Höfert
  • Kristiaan Schreve
  • Matthias Rätsch
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11531)


Various applications implement SLAM technology, especially in the field of robot navigation. We show the advantage of SLAM technology for independent 3D object reconstruction. To obtain a point cloud of every object of interest, free of its environment, we leverage deep learning. We utilize recent CNN deep learning research for accurate semantic segmentation of objects. In this work, we propose two fusion methods for CNN-based semantic segmentation and SLAM for the 3D reconstruction of objects of interest in order to obtain greater robustness and efficiency. As a major novelty, we introduce CNN-based masking to focus SLAM only on feature points belonging to each individual object. Noisy, complex, or even non-rigid features in the background are filtered out, improving the estimation of the camera pose and the 3D point cloud of each object. Our experiments are constrained to the reconstruction of industrial objects. We present an analysis of the accuracy and performance of each method and compare the two methods, describing their pros and cons.
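The masking idea can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a per-pixel label image is already available from some segmentation network, and the function name `split_points_by_object`, the label convention (0 for background), and the toy data are all hypothetical. It shows only the filtering step, in which candidate feature points on background pixels are discarded and the remainder are grouped per object.

```python
from typing import Dict, List, Sequence, Tuple

Point = Tuple[int, int]  # (x, y) pixel coordinates


def split_points_by_object(
    points: Sequence[Point],
    label_mask: Sequence[Sequence[int]],
    background: int = 0,
) -> Dict[int, List[Point]]:
    """Group feature points by the object label under each pixel.

    Hypothetical helper: points on background pixels or outside the
    image are dropped, so only object points would reach the tracker.
    """
    height = len(label_mask)
    width = len(label_mask[0]) if height else 0
    per_object: Dict[int, List[Point]] = {}
    for x, y in points:
        # Ignore points that fall outside the label image.
        if not (0 <= x < width and 0 <= y < height):
            continue
        label = label_mask[y][x]  # row index is y, column index is x
        if label == background:
            continue
        per_object.setdefault(label, []).append((x, y))
    return per_object


# Toy 4x5 label image: 0 = background, 1 and 2 = two objects of interest.
mask = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 0, 2],
    [0, 1, 1, 0, 2],
    [0, 0, 0, 0, 0],
]
candidates = [(1, 1), (2, 2), (4, 1), (0, 0), (3, 3), (9, 9)]
kept = split_points_by_object(candidates, mask)
print(kept)
```

In a real pipeline the label image would come from the CNN and the candidate points from the SLAM front end; the grouping by label is what would allow one point cloud per object.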


3D reconstruction, SLAM, LSD-SLAM, Monocular camera, CNN, Semantic segmentation, Bin-picking, Collaborative robot, Depth estimation



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Thomas Weber (1, email author)
  • Sergey Triputen (1, email author)
  • Atmaraaj Gopal (1)
  • Steffen Eißler (1)
  • Christian Höfert (1)
  • Kristiaan Schreve (2)
  • Matthias Rätsch (1)

  1. Reutlingen University, Reutlingen, Germany
  2. University of Stellenbosch, Stellenbosch, South Africa
