Integration of CNN into a Robotic Architecture to Build Semantic Maps of Indoor Environments

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 11507)

Abstract

In robotics, semantic mapping refers to the construction of a rich representation of the environment that includes the high-level information needed by the robot to accomplish its tasks. Building a semantic map requires algorithms that process sensor data at different levels (geometric, topological, and object detections/categories), whose results must be integrated into a unified model. This paper describes a robotic architecture that successfully builds such semantic maps for indoor environments. For this purpose, within a ROS-based ecosystem, we apply a state-of-the-art Convolutional Neural Network (CNN), concretely YOLOv3, to detect objects in images. The detection results are placed within a geometric map of the environment by making use of several modules of the architecture: robot localization, camera extrinsic calibration, data from a depth camera, etc. We demonstrate the suitability of the proposed framework by building semantic maps of several home environments from the Robot@Home dataset, using Unity 3D as a tool to visualize the maps as well as to support future robotic developments.
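The step of placing a detection within the geometric map can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a pinhole camera model, a camera that looks straight ahead with no tilt, and a planar robot pose (x, y, yaw) such as a 2D localization module might provide. All names (Intrinsics, backproject, optical_to_robot, robot_to_map, locate_detection) and all numeric values are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class Intrinsics:
    """Pinhole camera parameters (hypothetical values supplied by the caller)."""
    fx: float
    fy: float
    cx: float
    cy: float


def backproject(u, v, depth, K):
    """Pixel (u, v) with metric depth -> 3D point in the camera optical frame
    (x right, y down, z forward)."""
    x = (u - K.cx) * depth / K.fx
    y = (v - K.cy) * depth / K.fy
    return (x, y, depth)


def optical_to_robot(p_cam, cam_offset):
    """Camera optical frame -> robot base frame (x forward, y left, z up).
    Assumption: the camera looks straight ahead with no tilt, so the extrinsic
    calibration reduces to an axis swap plus the camera position on the robot."""
    x, y, z = p_cam
    ox, oy, oz = cam_offset
    return (z + ox, -x + oy, -y + oz)


def robot_to_map(p_robot, pose):
    """Robot base frame -> map frame, given a planar pose (x, y, yaw)."""
    px, py, yaw = pose
    x, y, z = p_robot
    c, s = math.cos(yaw), math.sin(yaw)
    return (px + c * x - s * y, py + s * x + c * y, z)


def locate_detection(bbox, depth, K, cam_offset, pose):
    """Place the centre of a bounding box (xmin, ymin, xmax, ymax) in the map,
    using the depth measured at that pixel."""
    u = 0.5 * (bbox[0] + bbox[2])
    v = 0.5 * (bbox[1] + bbox[3])
    p_cam = backproject(u, v, depth, K)
    return robot_to_map(optical_to_robot(p_cam, cam_offset), pose)


if __name__ == "__main__":
    K = Intrinsics(fx=500.0, fy=500.0, cx=320.0, cy=240.0)
    point = locate_detection(
        bbox=(300, 220, 340, 260),      # detection box in pixels
        depth=2.0,                      # metres, read from the depth image
        K=K,
        cam_offset=(0.1, 0.0, 1.0),     # camera position on the robot, metres
        pose=(1.0, 2.0, math.pi / 2),   # robot pose in the map
    )
    print(point)
```

A real system would replace the axis swap with the full calibrated camera-to-robot transform and would typically aggregate depth over the whole bounding box rather than sampling a single pixel.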

Notes

  1. http://wiki.ros.org/amcl

  2. http://wiki.ros.org/gmapping

  3. https://unity3d.com

  4. http://wiki.ros.org/rosbridge_suite

Acknowledgments

This work has been supported by the research projects WISER (DPI2017-84827-R), funded by the Spanish Government and financed by the European Regional Development Fund (FEDER), and MoveCare (ICT-26-2016b-GA-732158), funded by the European H2020 program; by a postdoc contract from the I-PPIT program of the University of Málaga; and by the UG PhD scholarship program of the University of Groningen.

Author information

Corresponding author

Correspondence to D. Chaves.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Chaves, D., Ruiz-Sarmiento, J.R., Petkov, N., Gonzalez-Jimenez, J. (2019). Integration of CNN into a Robotic Architecture to Build Semantic Maps of Indoor Environments. In: Rojas, I., Joya, G., Catala, A. (eds) Advances in Computational Intelligence. IWANN 2019. Lecture Notes in Computer Science, vol 11507. Springer, Cham. https://doi.org/10.1007/978-3-030-20518-8_27

  • DOI: https://doi.org/10.1007/978-3-030-20518-8_27

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-20517-1

  • Online ISBN: 978-3-030-20518-8

  • eBook Packages: Computer Science, Computer Science (R0)
