Automatic Collision Avoidance Using Deep Reinforcement Learning with Grid Sensor

Sawada, Ryohei

doi:10.1007/978-3-030-37442-6_3

Part of the book series: Proceedings in Adaptation, Learning and Optimization (PALO, volume 12)

Included in the following conference series:

Symposium on Intelligent and Evolutionary Systems

Abstract

This paper presents an automatic collision avoidance algorithm for multiple ships based on reinforcement learning (RL). The obstacle zone by target (OZT) is used to capture the dynamic information of multiple ships as 2-dimensional areas; an OZT marks a dangerous area where a collision may occur. A new method is then proposed that uses a virtual sensor divided into a grid to detect multiple ships simultaneously. The sensor detects OZTs efficiently and encodes where they extend as part of a fixed-dimension state vector, which serves as the input to an RL algorithm. I applied a deep RL algorithm, and the agent learned manoeuvres from a set of ship encounter situations called the Imazu problem. In simulations, the learned model avoids collisions in all encounter situations involving up to three target ships. The proposed approach can learn manoeuvres that handle both waypoint navigation and collision avoidance.
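The idea of turning a variable number of OZTs into a fixed-dimension input can be illustrated with a minimal sketch. It is not the paper's implementation: the polar layout, the cell counts, the sensing range, the representation of each OZT as sampled points, and the name `grid_sensor` are all assumptions made for illustration.

```python
import math


def grid_sensor(own_xy, own_heading, ozt_points,
                max_range=10.0, n_rings=4, n_sectors=8):
    """Hypothetical grid sensor: map OZT sample points to a flat 0/1 vector.

    own_heading is in radians, clockwise from north (the +y axis).
    The returned list always has n_rings * n_sectors entries, no matter
    how many target ships (and therefore OZT points) are present.
    """
    cells = [0.0] * (n_rings * n_sectors)
    ring_width = max_range / n_rings
    sector_width = 2.0 * math.pi / n_sectors

    for px, py in ozt_points:
        dx, dy = px - own_xy[0], py - own_xy[1]
        dist = math.hypot(dx, dy)
        if dist >= max_range:
            continue  # outside the assumed sensing range

        # Bearing of the point relative to own heading, wrapped to [0, 2*pi).
        bearing = (math.atan2(dx, dy) - own_heading) % (2.0 * math.pi)

        ring = min(int(dist // ring_width), n_rings - 1)
        sector = int(bearing // sector_width) % n_sectors
        cells[ring * n_sectors + sector] = 1.0  # cell overlaps an OZT

    return cells


# Usage: two OZT points in range and one out of range.
state = grid_sensor((0.0, 0.0), 0.0, [(0.0, 1.0), (3.0, 1.0), (20.0, 0.0)])
print(len(state))  # length is n_rings * n_sectors for any number of ships
```

Because the vector length depends only on the grid layout, the same network input size can be used whether one or several target ships are present, which is the property the abstract attributes to the grid sensor.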

References

IMO takes first steps to address autonomous ships (2018). http://www.imo.org
Varas, J.M., Hirdaris, S., Smith, R., Scialla, P., Caharija, W., Bhuiyan, Z., Mills, T., Naeem, W., Hu, L., Renton, I., Motson, D., Rajabally, E.: MAXCMAS project: autonomous COLREGs compliant ship navigation. In: Proceedings of the 16th Conference on Computer Applications and Information Technology in the Maritime Industries 2017, pp. 454–464 (2017)
Imazu, H.: Computation of OZT by using collision course. Navigation 188, 78–81 (2014)
Imazu, H.: Evaluation method of collision risk by using true motion. Int. J. Mar. Navig. Saf. Sea Transp. (TransNav) 11(1), 65–70 (2017)
Schulman, J., Wolski, F., Dhariwal, P., Radford, A., Klimov, O.: Proximal policy optimization algorithms. arXiv preprint arXiv:1707.06347 (2017)
Mnih, V., Badia, A.P., Mirza, M., Graves, A., Lillicrap, T., Harley, T., Silver, D., Kavukcuoglu, K.: Asynchronous methods for deep reinforcement learning. In: Proceedings of the 33rd International Conference on Machine Learning, PMLR, vol. 48, pp. 1928–1937 (2016)
Imazu, H., Koyama, T.: The optimization of the criterion for collision avoidance action-II. J. Jpn. Inst. Navig. 72, 23–30 (1985)
Imazu, H., Koyama, T.: The optimization of the criterion for collision avoidance action-III. J. Jpn. Inst. Navig. 73, 19–26 (1985)
Kouzuki, A., Hasegawa, K.: Automatic collision avoidance system for ships using fuzzy control. J. Kansai Soc. Nav. Arch. Jpn. 205, 1–10 (1987)
Cai, Y., Hasegawa, K.: Evaluating of marine traffic simulation system through Imazu problem. In: The Proceedings of Japan Society of Naval Architecture and Ocean Engineering, vol. 17, pp. 191–194 (2013)
Imazu, H.: Research on collision avoidance manoeuvre (in Japanese). Ph.D. thesis, University of Tokyo, Japan (1987)
Hu, L., Naeem, W., Rajabally, E., Watson, G., Mills, T., Bhuiyan, Z., Salter, I.: COLREGs-compliant path planning for autonomous surface vehicles: a multi-objective optimization approach. In: 20th IFAC World Congress, vol. 50, pp. 13662–13667 (2017)
Nagasawa, A., Hara, K., Inoue, K.: The subjective difficulties of the situation of collision avoidance-I: toward the rating by simulation. J. Jpn. Inst. Navig. 79, 91–100 (1988)
Nagasawa, A., Hara, K., Inoue, K., Kose, K.: The subjective difficulties of the situation of collision avoidance-II: toward the rating by simulation. J. Jpn. Inst. Navig. 88, 137–144 (1993)
Taniguchi, Y., Matsuda, A., Sera, W., Terada, D., Hashimoto, H.: Validation of a ship collision avoidance algorithm in congested sea area by means of model experiment. In: The Proceedings of Japan Society of Naval Architecture and Ocean Engineering, vol. 23, pp. 627–632 (2016)
Kuwata, Y., Wolf, M.T., Zarzhitsky, D., Huntsberger, T.L.: Safe maritime autonomous navigation with COLREGs, using velocity obstacles. IEEE J. Ocean. Eng. 39, 110–119 (2014)
Mitsubori, K., Kamio, T., Tanaka, T.: Finding the course and collision avoidance based on reinforcement learning algorithm. Navigation 170, 26–31 (2009)
Shen, H., Hashimoto, H., Matsuda, A., Taniguchi, Y., Terada, D., Guo, C.: Automatic collision avoidance of multiple ships based on deep Q-learning. Appl. Ocean Res. 86, 268–288 (2019)
Rachman, A.S.A.: 3D-LIDAR multi object tracking for autonomous driving: multi-target detection and tracking under urban road uncertainties. Master's thesis, Delft University of Technology, Netherlands (2017)
Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. In: Proceedings of the 3rd International Conference on Learning Representations (ICLR) (2015)
DeepX, Inc.: Machina: a library for real-world deep reinforcement learning (2019). https://machina-rl.org/
Paszke, A., Gross, S., Chintala, S., Chanan, G., Yang, E., DeVito, Z., Lin, Z., Desmaison, A., Antiga, L., Lerer, A.: Automatic differentiation in PyTorch. NIPS Autodiff Workshop (2017)

Author information

Authors and Affiliations

National Maritime Research Institute, National Institute of Maritime, Port and Aviation Technology, Tokyo, Japan
Ryohei Sawada

Corresponding author

Correspondence to Ryohei Sawada.

Editor information

Editors and Affiliations

Department of Computer Science, National Defense Academy of Japan, Yokosuka-shi, Japan
Hiroshi Sato
Faculty of Maritime Safety Technology, Japan Coast Guard Academy, Wakabacho, Japan
Saori Iwanaga
Department of Applied Mathematics and Physics, Tottori University, Tottori, Japan
Akira Ishii

A Appendix

The configuration of the networks and the hyper-parameters of PPO used in this paper are shown in Table 3. Machina officially supports only Ubuntu, not Windows. Therefore, parts of the machina code, such as the use of multiprocessing in PyTorch and Python and the log output format, have been modified to run on Windows.

Table 3. Hyper-parameter values for PPO
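To show what one typical PPO hyper-parameter controls, the following minimal sketch evaluates the clipped surrogate objective from the PPO paper for a single sample. The function name and the default `clip_eps=0.2` are illustrative assumptions; they are not values taken from Table 3 or from the machina code.

```python
import math


def ppo_clip_objective(log_prob_new, log_prob_old, advantage, clip_eps=0.2):
    """Clipped surrogate objective of PPO for one (state, action) sample.

    clip_eps is an assumed example value, not the setting used in the paper.
    """
    # Probability ratio between the updated policy and the data-collecting one.
    ratio = math.exp(log_prob_new - log_prob_old)
    # Restrict the ratio to [1 - clip_eps, 1 + clip_eps].
    clipped_ratio = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
    # Take the pessimistic (smaller) of the unclipped and clipped terms.
    return min(ratio * advantage, clipped_ratio * advantage)


# Usage: a policy that doubled the action probability gains no extra
# objective beyond the clipped value when the advantage is positive.
value = ppo_clip_objective(math.log(0.8), math.log(0.4), advantage=1.0)
print(value)
```

A smaller clipping range keeps each policy update closer to the policy that collected the data, at the cost of slower learning.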

About this paper

Cite this paper

Sawada, R. (2020). Automatic Collision Avoidance Using Deep Reinforcement Learning with Grid Sensor. In: Sato, H., Iwanaga, S., Ishii, A. (eds) Proceedings of the 23rd Asia Pacific Symposium on Intelligent and Evolutionary Systems. IES 2019. Proceedings in Adaptation, Learning and Optimization, vol 12. Springer, Cham. https://doi.org/10.1007/978-3-030-37442-6_3

DOI: https://doi.org/10.1007/978-3-030-37442-6_3
Published: 05 December 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-37441-9
Online ISBN: 978-3-030-37442-6
eBook Packages: Intelligent Technologies and Robotics (R0)
