RGB-D SLAM Based Incremental Cuboid Modeling

  • Masashi Mishima
  • Hideaki Uchiyama
  • Diego Thomas
  • Rin-ichiro Taniguchi
  • Rafael Roberto
  • João Paulo Lima
  • Veronica Teichrieb
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11129)


This paper presents a framework for incremental 3D cuboid modeling combined with RGB-D SLAM. While RGB-D SLAM is running, planes are incrementally reconstructed from point clouds. Cuboids are then detected among the planes by analyzing three positional relationships between them: orthogonality, convexity, and proximity. Finally, the position, pose, and size of a cuboid are determined by computing the intersection of three perpendicular planes. In addition, the cuboid shapes are incrementally updated with sequential measurements to suppress false detections. As an application of our framework, an augmented reality based interactive cuboid modeling system is introduced. In an evaluation in a cluttered environment, our framework improves the precision and recall of cuboid detection compared with a batch-based method, owing to stable plane detection.
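The abstract does not give formulas, so the following is only a minimal sketch of the corner computation it mentions, namely the intersection of three perpendicular planes. It assumes each plane is stored as a unit normal n and an offset d with n · x = d; the function names and the orthogonality tolerance `tol_cos` are hypothetical and are not taken from the paper.

```python
from typing import Optional, Sequence, Tuple

Vec3 = Tuple[float, float, float]
Plane = Tuple[Vec3, float]  # assumed form: (unit normal n, offset d) with n . x = d


def dot(a: Vec3, b: Vec3) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def cross(a: Vec3, b: Vec3) -> Vec3:
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def are_mutually_orthogonal(normals: Sequence[Vec3], tol_cos: float = 0.1) -> bool:
    """Check that three unit normals are pairwise close to perpendicular.

    tol_cos is an illustrative threshold on |cos(angle)|, not a value from the paper.
    """
    pairs = ((0, 1), (1, 2), (0, 2))
    return all(abs(dot(normals[i], normals[j])) <= tol_cos for i, j in pairs)


def three_plane_intersection(planes: Sequence[Plane], eps: float = 1e-9) -> Optional[Vec3]:
    """Return the single point shared by three planes, or None if none exists.

    Uses the closed form
        p = (d1 (n2 x n3) + d2 (n3 x n1) + d3 (n1 x n2)) / (n1 . (n2 x n3)).
    """
    (n1, d1), (n2, d2), (n3, d3) = planes
    c23, c31, c12 = cross(n2, n3), cross(n3, n1), cross(n1, n2)
    det = dot(n1, c23)
    if abs(det) < eps:  # at least two planes are (nearly) parallel
        return None
    return tuple((d1 * c23[k] + d2 * c31[k] + d3 * c12[k]) / det for k in range(3))


if __name__ == "__main__":
    # Three axis-aligned planes: x = 1, y = 2, z = 3.
    planes = [((1.0, 0.0, 0.0), 1.0), ((0.0, 1.0, 0.0), 2.0), ((0.0, 0.0, 1.0), 3.0)]
    if are_mutually_orthogonal([n for n, _ in planes]):
        print(three_plane_intersection(planes))
```

In this sketch the orthogonality check acts as a gate before the corner is computed, mirroring the order of steps in the abstract; the convexity and proximity checks are left out because the abstract gives no detail on how they are evaluated.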


Geometric shape · Cuboid · Incrementally structural modeling · Point cloud



This work is supported by JSPS KAKENHI Grant Number JP17H01768.

Supplementary material

Supplementary material 1 (mp4 49230 KB)



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Masashi Mishima (1)
  • Hideaki Uchiyama (1)
  • Diego Thomas (1)
  • Rin-ichiro Taniguchi (1)
  • Rafael Roberto (2)
  • João Paulo Lima (2, 3)
  • Veronica Teichrieb (2)

  1. Kyushu University, Fukuoka, Japan
  2. Universidade Federal de Pernambuco, Recife, Brazil
  3. Universidade Federal Rural de Pernambuco, Recife, Brazil
