Abstract
Explicitly programming obstacle avoidance for an autonomous robot can be computationally expensive. Reinforcement learning algorithms promise to reduce this programming effort, but they rely on iterative training processes and are therefore time-consuming. To overcome this drawback, we propose shifting the training process to abstract simulation scenarios. In this study we trained four reinforcement learning algorithms (Q-Learning, Deep Q-Learning, Deep Deterministic Policy Gradient, and Asynchronous Advantage Actor-Critic) in different abstract simulation scenarios and transferred the learning results to an autonomous robot. Except for Asynchronous Advantage Actor-Critic, all algorithms achieved good obstacle avoidance in simulation. When transferred to the autonomous robot, the policies learned by Q-Learning and Deep Q-Learning achieved obstacle avoidance immediately, without further real-world training.
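The tabular Q-Learning named in the abstract can be illustrated with a minimal sketch. The toy corridor environment, the distance-based state discretisation, the two actions, and all reward values below are hypothetical stand-ins chosen for illustration, not the simulation setup or parameters used in the paper:

```python
import random

# Hypothetical sketch: tabular Q-learning for obstacle avoidance in a toy
# 1-D corridor. States are discretised distances to the obstacle ahead
# (0 = collision); the agent can drive forward or turn away.
N_STATES = 5
ACTIONS = ["forward", "turn"]
ALPHA, GAMMA, EPS = 0.1, 0.9, 0.1   # learning rate, discount, exploration

Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

def step(state, action):
    """Toy dynamics: forward closes the distance; turning resets it."""
    if action == "forward":
        nxt = max(state - 1, 0)
        reward = -10.0 if nxt == 0 else 1.0  # collision is heavily penalised
    else:
        nxt, reward = N_STATES - 1, -0.1     # turning costs a little
    return nxt, reward

def policy(state):
    """Epsilon-greedy action selection over the Q-table."""
    if random.random() < EPS:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q[(state, a)])

random.seed(0)
for _ in range(2000):                        # training episodes
    s = N_STATES - 1
    for _ in range(20):                      # bounded episode length
        a = policy(s)
        s2, r = step(s, a)
        # Q-learning update: bootstrap on the greedy value of the next state
        Q[(s, a)] += ALPHA * (r + GAMMA * max(Q[(s2, b)] for b in ACTIONS)
                              - Q[(s, a)])
        s = s2
        if s == 0:                           # collision ends the episode
            break

# The greedy policy learns to turn away when the obstacle is close.
print(max(ACTIONS, key=lambda a: Q[(1, a)]))
```

Because the table maps discretised sensor readings directly to actions, such a learned policy can be copied to a robot without retraining, which is the kind of simulation-to-robot transfer the study evaluates.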
© 2019 Springer Nature Switzerland AG
Cite this paper
Lenk, M., Hilsendegen, P., Müller, S.M., Rettig, O., Strand, M. (2019). Simulation and Transfer of Reinforcement Learning Algorithms for Autonomous Obstacle Avoidance. In: Strand, M., Dillmann, R., Menegatti, E., Ghidoni, S. (eds) Intelligent Autonomous Systems 15. IAS 2018. Advances in Intelligent Systems and Computing, vol 867. Springer, Cham. https://doi.org/10.1007/978-3-030-01370-7_32
Print ISBN: 978-3-030-01369-1
Online ISBN: 978-3-030-01370-7