Collision Avoidance for Indoor Service Robots Through Multimodal Deep Reinforcement Learning

  • Francisco Leiva
  • Kenzo Lobos-Tsunekawa
  • Javier Ruiz-del-Solar
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11531)


In this paper, we propose an end-to-end approach to endow indoor service robots with the ability to avoid collisions using Deep Reinforcement Learning (DRL). The proposed method allows a controller to derive continuous velocity commands for an omnidirectional mobile robot using depth images, laser measurements, and odometry-based speed estimations. The controller is parameterized by a deep neural network and trained using Deep Deterministic Policy Gradient (DDPG). To compensate for the limited perceptual range of most indoor robots, a method to exploit range measurements through sensor integration and feature extraction is developed. Additionally, to alleviate the reality gap that arises from training in simulation, a simple processing pipeline for depth images is proposed. As a case study, we consider indoor collision avoidance using the Pepper robot. Through simulated testing, we show that our approach is able to learn a proficient collision avoidance policy from scratch. Furthermore, we empirically show the generalization capabilities of the trained policy by testing it in challenging real-world environments. Videos showing the behavior of agents trained using the proposed method can be found at
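
For readers unfamiliar with DDPG, the following is a minimal sketch of three generic building blocks of the algorithm: the critic's bootstrapped target, the soft update of the target networks, and a clipped noisy action. It is not the authors' implementation. The function names, the hyperparameter values, and the assumption that the action is a list of bounded velocity components are illustrative only, and the networks themselves are abstracted away as plain numbers and lists.

```python
def td_target(reward, next_q, done, gamma=0.99):
    """Critic regression target: y = r + gamma * (1 - done) * Q'(s', mu'(s')).

    next_q stands in for the target critic evaluated at the next state and
    the target actor's action (assumed to be computed elsewhere).
    """
    not_done = 0.0 if done else 1.0
    return reward + gamma * not_done * next_q


def soft_update(target_params, online_params, tau=0.001):
    """Polyak averaging of target parameters: theta' <- tau*theta + (1 - tau)*theta'."""
    return [tau * p + (1.0 - tau) * t for t, p in zip(target_params, online_params)]


def clip_action(action, noise, limit=1.0):
    """Add exploration noise to a deterministic action and clip each
    component to [-limit, limit] (hypothetical normalized velocity bounds)."""
    return [max(-limit, min(limit, a + n)) for a, n in zip(action, noise)]


if __name__ == "__main__":
    # One non-terminal transition with reward 1.0 and a next-state value of 2.0.
    print(td_target(1.0, 2.0, done=False, gamma=0.5))
    # Move the target parameters halfway towards the online parameters.
    print(soft_update([0.0, 1.0], [1.0, 1.0], tau=0.5))
    # Three hypothetical velocity components (vx, vy, angular) with added noise.
    print(clip_action([0.9, -0.9, 0.0], [0.5, -0.5, 0.1]))
```

In a full training loop, the target from `td_target` would be used to regress the critic over a minibatch sampled from a replay buffer, and `soft_update` would be applied to both the actor and critic target networks after each gradient step.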



This work was partially funded by FONDECYT Project 1161500 and CONICYT-PFCHA/Magíster Nacional/2018-22182130.



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Francisco Leiva (1, corresponding author)
  • Kenzo Lobos-Tsunekawa (1)
  • Javier Ruiz-del-Solar (1, 2)

  1. Department of Electrical Engineering, Universidad de Chile, Santiago, Chile
  2. Advanced Mining Technology Center (AMTC), Universidad de Chile, Santiago, Chile
