RobotFusion: Grasping with a Robotic Manipulator via Multi-view Reconstruction

  • Daniele De Gregorio
  • Federico Tombari
  • Luigi Di Stefano
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9915)


We propose a complete system for 3D object reconstruction and grasping based on an articulated robotic manipulator. An RGB-D sensor is mounted directly on the robotic arm as an end effector, and the acquired data are processed to perform multi-view 3D reconstruction and object grasping. We leverage the high repeatability of the robotic arm to estimate 3D camera poses with millimeter accuracy and to control each of the sensor’s six DOF within a dexterous workspace. Camera poses can thus be estimated directly from robot kinematics, and a Truncated Signed Distance Function (TSDF) is used to accurately fuse multiple views into a unified 3D reconstruction of the scene. Finally, we propose an efficient approach to segment the sought objects from a planar workbench, as well as a novel algorithm to automatically estimate grasping points.
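The idea of fusing views with kinematics-derived camera poses can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a fixed flange-to-camera (hand-eye) transform, a camera looking along the +z axis of the base frame, and a single column of voxels on the optical axis. All function names and numeric values are hypothetical and chosen only for illustration.

```python
def matmul4(a, b):
    """Multiply two 4x4 homogeneous transforms given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def translation(x, y, z):
    """Pure-translation homogeneous transform (rotation omitted for brevity)."""
    return [[1.0, 0.0, 0.0, x],
            [0.0, 1.0, 0.0, y],
            [0.0, 0.0, 1.0, z],
            [0.0, 0.0, 0.0, 1.0]]


def camera_pose(base_T_flange, flange_T_camera):
    """Camera pose in the base frame: forward kinematics composed with an
    assumed, pre-calibrated flange-to-camera transform."""
    return matmul4(base_T_flange, flange_T_camera)


def fuse_view(tsdf, weights, voxel_z, cam_z, measured_depth, trunc):
    """Integrate one depth reading into the voxel column with a weighted
    running average of truncated signed distances."""
    for i, z in enumerate(voxel_z):
        voxel_depth = z - cam_z
        if voxel_depth <= 0.0:
            continue  # voxel is behind the camera
        sdf = measured_depth - voxel_depth
        if sdf < -trunc:
            continue  # far behind the observed surface: no information
        d = min(1.0, sdf / trunc)  # normalized, truncated signed distance
        w = weights[i]
        tsdf[i] = (tsdf[i] * w + d) / (w + 1.0)
        weights[i] = w + 1.0


def zero_crossing(tsdf, weights, voxel_z):
    """Locate the surface by linear interpolation at the first
    positive-to-negative sign change between two observed voxels."""
    for i in range(len(tsdf) - 1):
        a, b = tsdf[i], tsdf[i + 1]
        if weights[i] > 0 and weights[i + 1] > 0 and a > 0 >= b:
            t = a / (a - b)
            return voxel_z[i] + t * (voxel_z[i + 1] - voxel_z[i])
    return None


def demo():
    """Fuse two noisy views of a surface at z = 0.52 (illustrative values)."""
    voxel_z = [i * 0.05 for i in range(21)]   # voxel centres, 0.0 to 1.0
    tsdf = [1.0] * len(voxel_z)               # initialised as free space
    weights = [0.0] * len(voxel_z)
    true_surface_z = 0.52
    flange_T_camera = translation(0.0, 0.0, 0.05)  # assumed hand-eye offset
    # (flange height from kinematics, simulated depth noise) per view
    views = [(-0.35, +0.01), (-0.15, -0.01)]
    for flange_z, noise in views:
        pose = camera_pose(translation(0.0, 0.0, flange_z), flange_T_camera)
        cam_z = pose[2][3]
        measured_depth = (true_surface_z - cam_z) + noise
        fuse_view(tsdf, weights, voxel_z, cam_z, measured_depth, trunc=0.1)
    return zero_crossing(tsdf, weights, voxel_z)
```

Calling `demo()` returns a surface estimate close to 0.52: the two readings err in opposite directions, and the running average cancels the error without any image-based pose tracking. A full system would use a 3D voxel grid, full rotations in the pose, and per-pixel projection through the camera intrinsics.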


Keywords: Grasp, Manipulation, Reconstruction


Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  • Daniele De Gregorio (1)
  • Federico Tombari (1)
  • Luigi Di Stefano (1)
  1. DISI, University of Bologna, Bologna, Italy
