Design of Transfer Reinforcement Learning Mechanisms for Autonomous Collision Avoidance

Liu, Xiongqing; Jin, Yan

doi:10.1007/978-3-030-05363-5_17

Included in the following conference series:

International Conference on Design Computing and Cognition

Abstract

It is often hard for a reinforcement learning (RL) agent to use previous experience to solve new tasks that are similar but more complex. In this research, we combine transfer learning with reinforcement learning and investigate how the hyperparameters of both impact learning effectiveness and task performance in the context of autonomous robotic collision avoidance. A deep reinforcement learning algorithm was first implemented for a robot to learn, from its own experience, how to avoid randomly generated single obstacles. The effect of transferring previously learned experience was then studied by introducing two important concepts: transfer belief (how much a robot should believe in its previous experience) and transfer period (how long the previous experience should be applied in the new context). The proposed approach was tested on collision avoidance problems by varying the transfer period. The results show that transfer learning on average yielded a ~50% increase in learning speed at ~30% competence levels, and that there exists an optimal transfer period at which the variance is lowest and learning is fastest.
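
The abstract does not spell out how transfer belief and transfer period enter the learning loop, so the following is only a minimal sketch of one plausible reading, not the authors' method. It assumes that during the first `transfer_period` episodes the agent follows the greedy action of a previously trained (source) value function with probability `transfer_belief`, and otherwise acts epsilon-greedily on the value function being learned for the new task. The function name `select_action`, its parameters, and the probabilistic mixing rule are all hypothetical.

```python
import random
from typing import Callable, Optional, Sequence


def select_action(
    episode: int,
    state: object,
    source_q: Callable[[object], Sequence[float]],
    target_q: Callable[[object], Sequence[float]],
    transfer_belief: float,
    transfer_period: int,
    epsilon: float = 0.1,
    rng: Optional[random.Random] = None,
) -> int:
    """Pick an action index, mixing a source policy into early training.

    ASSUMPTION (not from the paper): transfer_belief is treated as the
    probability of trusting the source policy, and transfer_period as the
    number of initial episodes during which that trust applies.
    """
    rng = rng or random.Random()

    # Within the transfer period, trust previous experience with
    # probability `transfer_belief` and act greedily on the source values.
    if episode < transfer_period and rng.random() < transfer_belief:
        q_values = source_q(state)
        return max(range(len(q_values)), key=q_values.__getitem__)

    # Otherwise fall back to ordinary epsilon-greedy on the target values.
    q_values = target_q(state)
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=q_values.__getitem__)
```

Under this reading, setting `transfer_period` to zero recovers plain epsilon-greedy learning from scratch, while a very long period with high belief keeps the agent tied to behaviour learned on the simpler task. That trade-off is consistent with the abstract's observation that an optimal transfer period exists.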

Author information

Authors and Affiliations

University of Southern California, Los Angeles, USA
Xiongqing Liu & Yan Jin

Corresponding author

Correspondence to Yan Jin.

Editor information

Editors and Affiliations

Department of Computer Science and School of Architecture, University of North Carolina at Charlotte, Charlotte, NC, USA
John S. Gero

About this paper

Cite this paper

Liu, X., Jin, Y. (2019). Design of Transfer Reinforcement Learning Mechanisms for Autonomous Collision Avoidance. In: Gero, J. (eds) Design Computing and Cognition '18. DCC 2018. Springer, Cham. https://doi.org/10.1007/978-3-030-05363-5_17

DOI: https://doi.org/10.1007/978-3-030-05363-5_17
Published: 08 January 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-05362-8
Online ISBN: 978-3-030-05363-5
eBook Packages: Engineering, Engineering (R0)
