Towards Semantic KinectFusion

  • Nicola Fioraio
  • Gregorio Cerri
  • Luigi Di Stefano
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8157)


In this paper we propose an extension to the KinectFusion approach that enables both SLAM-graph optimization, which is usually required on large looping routes, and the discovery of semantic information in the form of object detection and localization. Global optimization is achieved by incorporating the notion of keyframes into a KinectFusion-style approach, which gives the system the ability to explore large environments while maintaining a globally consistent map. Moreover, we integrate into the system our recent object detection approach, based on a new Semantic Bundle Adjustment paradigm, thereby achieving joint detection, tracking and mapping. Although our current implementation is not optimized for real-time operation, the principles and ideas set forth in this paper are a relevant contribution towards a Semantic KinectFusion system.
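To make the role of keyframes and loop closure in global optimization more concrete, the following is a minimal sketch of pose-graph relaxation over keyframes. It is not the authors' implementation: it assumes translation-only 2D keyframe poses, unit-weight constraints, and a simple Gauss-Seidel solver, and the names `chain_odometry`, `total_error`, and `optimize` are hypothetical. A real system would optimize full 6-DoF poses with a dedicated graph optimizer.

```python
# Toy pose graph: each keyframe is a 2D position (rotation is ignored, which
# is an assumption of this sketch). An edge (i, j, (dx, dy)) states that
# keyframe j was measured at offset (dx, dy) from keyframe i.

def chain_odometry(n, edges):
    """Initial guess: chain the consecutive (i, i + 1) edges from the origin.

    Assumes the odometry edges are listed in increasing order of i.
    """
    poses = [(0.0, 0.0)] * n
    for i, j, (dx, dy) in edges:
        if j == i + 1:
            poses[j] = (poses[i][0] + dx, poses[i][1] + dy)
    return poses


def total_error(poses, edges):
    """Sum of squared residuals over all constraints."""
    err = 0.0
    for i, j, (dx, dy) in edges:
        rx = poses[j][0] - poses[i][0] - dx
        ry = poses[j][1] - poses[i][1] - dy
        err += rx * rx + ry * ry
    return err


def optimize(poses, edges, iterations=100):
    """Gauss-Seidel relaxation with keyframe 0 held fixed as the anchor.

    Each free keyframe is moved to the average of the positions predicted
    for it by its incident edges, which minimizes the unit-weight
    least-squares error with respect to that keyframe.
    """
    poses = list(poses)
    for _ in range(iterations):
        for k in range(1, len(poses)):
            sx = sy = 0.0
            count = 0
            for i, j, (dx, dy) in edges:
                if j == k:      # edge arrives at k: predict from i
                    sx += poses[i][0] + dx
                    sy += poses[i][1] + dy
                    count += 1
                elif i == k:    # edge leaves k: predict from j
                    sx += poses[j][0] - dx
                    sy += poses[j][1] - dy
                    count += 1
            if count:
                poses[k] = (sx / count, sy / count)
    return poses


# Four keyframes around a square loop; the odometry overestimates each step.
edges = [
    (0, 1, (1.1, 0.0)),
    (1, 2, (0.0, 1.1)),
    (2, 3, (-1.1, 0.0)),
    (3, 0, (0.0, -1.0)),  # loop closure back to the first keyframe
]
initial = chain_odometry(4, edges)
optimized = optimize(initial, edges)
print("error before:", total_error(initial, edges))
print("error after: ", total_error(optimized, edges))
```

In this toy setting the loop-closure edge exposes the accumulated drift, and the relaxation spreads it over all edges of the loop instead of leaving it concentrated at the closing constraint.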


Keywords: KinectFusion, semantic SLAM, semantic bundle adjustment, object detection



Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Nicola Fioraio (1)
  • Gregorio Cerri (1)
  • Luigi Di Stefano (1)
  1. Dept. of Computer Science and Engineering, University of Bologna, Bologna, Italy
