Structure Aware SLAM Using Quadrics and Planes

Hosseinzadeh, Mehdi; Latif, Yasir; Pham, Trung; Suenderhauf, Niko; Reid, Ian

doi:10.1007/978-3-030-20893-6_26

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 11363)

Included in the following conference series:

Asian Conference on Computer Vision

Abstract

Simultaneous Localization And Mapping (SLAM) is a fundamental problem in mobile robotics. While point-based SLAM methods provide accurate camera localization, the maps they generate lack semantic information. State-of-the-art object detection methods, on the other hand, provide rich information about the entities present in a scene from a single image. This work combines the two: it proposes representing generic objects as quadrics, which allows object detections to be seamlessly integrated into a SLAM framework. For scene coverage, dominant planar structures are additionally modeled as infinite planes. Experiments show that the proposed points-planes-quadrics representation can easily incorporate Manhattan and object affordance constraints, greatly improving camera localization and leading to semantically meaningful maps.
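To illustrate how a quadric landmark can be related to a 2D object detection, the following is a minimal sketch based on standard multi-view geometry rather than on the chapter's own formulation, which the abstract does not spell out. It assumes an ellipsoid stored as a dual quadric Q*, a pinhole camera matrix P, the textbook projection C* = P Q* Pᵀ, and an axis-aligned box read off the resulting dual conic. All function names and the camera parameters are hypothetical.

```python
import numpy as np

def ellipsoid_dual_quadric(R, t, semi_axes):
    """Dual quadric Q* (4x4) of an ellipsoid with rotation R, centre t, semi-axes (a, b, c)."""
    Z = np.eye(4)          # homogeneous pose of the ellipsoid in the world frame
    Z[:3, :3] = R
    Z[:3, 3] = t
    a, b, c = semi_axes
    return Z @ np.diag([a * a, b * b, c * c, -1.0]) @ Z.T

def project_dual_quadric(P, Q_dual):
    """Project a dual quadric with a 3x4 camera matrix P to a dual conic C* (3x3)."""
    return P @ Q_dual @ P.T

def conic_bounding_box(C_dual):
    """Axis-aligned box (u_min, v_min, u_max, v_max) tangent to the conic.

    A line l is tangent to the conic when l^T C* l = 0; solving this for the
    vertical lines (1, 0, -u) and horizontal lines (0, 1, -v) gives the extents.
    """
    C = C_dual / C_dual[2, 2]                  # scale so that C[2, 2] == 1
    du = np.sqrt(C[0, 2] ** 2 - C[0, 0])
    dv = np.sqrt(C[1, 2] ** 2 - C[1, 1])
    return np.array([C[0, 2] - du, C[1, 2] - dv, C[0, 2] + du, C[1, 2] + dv])

# Hypothetical setup: unit sphere 5 m in front of a camera at the origin.
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])
P = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
Q = ellipsoid_dual_quadric(np.eye(3), np.array([0.0, 0.0, 5.0]), (1.0, 1.0, 1.0))
bbox = conic_bounding_box(project_dual_quadric(P, Q))
print(bbox)
```

In a factor-graph back end, the difference between such a predicted box and the detector's box could serve as a residual; whether the chapter uses exactly this error term is an assumption here.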

Supported by the ARC Fellowship FL130100102 to IR and the ACRV CE140100016.

Notes

1.
A comparison of the RMSE of the relative errors (RTE and RRE) and a run-time analysis are reported in the supplementary material.

Author information

Authors and Affiliations

The University of Adelaide, Adelaide, Australia
Mehdi Hosseinzadeh, Yasir Latif & Ian Reid
Queensland University of Technology, Brisbane, Australia
Niko Suenderhauf
Australian Centre for Robotic Vision, Brisbane, Australia
Mehdi Hosseinzadeh, Yasir Latif, Niko Suenderhauf & Ian Reid
NVIDIA, Santa Clara, CA, 95051, USA
Trung Pham

Corresponding author

Correspondence to Mehdi Hosseinzadeh.

Editor information

Editors and Affiliations

IIIT Hyderabad, Hyderabad, India
C. V. Jawahar
ANU, Canberra, ACT, Australia
Hongdong Li
Simon Fraser University, Burnaby, BC, Canada
Greg Mori
ETH Zurich, Zurich, Zürich, Switzerland
Konrad Schindler

1 Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (mp4 39976 KB)

Supplementary material 2 (pdf 4841 KB)

About this paper

Cite this paper

Hosseinzadeh, M., Latif, Y., Pham, T., Suenderhauf, N., Reid, I. (2019). Structure Aware SLAM Using Quadrics and Planes. In: Jawahar, C., Li, H., Mori, G., Schindler, K. (eds) Computer Vision – ACCV 2018. ACCV 2018. Lecture Notes in Computer Science, vol 11363. Springer, Cham. https://doi.org/10.1007/978-3-030-20893-6_26

DOI: https://doi.org/10.1007/978-3-030-20893-6_26
Published: 29 May 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-20892-9
Online ISBN: 978-3-030-20893-6
eBook Packages: Computer Science, Computer Science (R0)
