Local Communication Protocols for Learning Complex Swarm Behaviors with Deep Reinforcement Learning

Hüttenrauch, Maximilian; Šošić, Adrian; Neumann, Gerhard

doi:10.1007/978-3-030-00533-7_6


Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 11172)

Included in the following conference series:

International Conference on Swarm Intelligence


Abstract

Swarm systems constitute a challenging problem for reinforcement learning (RL) as the algorithm needs to learn decentralized control policies that can cope with limited local sensing and communication abilities of the agents. While it is often difficult to directly define the behavior of the agents, simple communication protocols can be defined more easily using prior knowledge about the given task. In this paper, we propose a number of simple communication protocols that can be exploited by deep reinforcement learning to find decentralized control policies in a multi-robot swarm environment. The protocols are based on histograms that encode the local neighborhood relations of the agents and can also transmit task-specific information, such as the shortest distance and direction to a desired target. In our framework, we use an adaptation of Trust Region Policy Optimization to learn complex collaborative tasks, such as formation building and building a communication link. We evaluate our findings in a simulated 2D-physics environment, and compare the implications of different communication protocols.
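To make the idea of a histogram-based observation concrete, the following is a minimal Python sketch of how an agent might summarize its local neighborhood. It is not the authors' implementation: the function name `neighborhood_histogram`, the choice of distance and bearing as the binned quantities, the bin counts, and the communication range are all assumptions made for illustration. Task-specific quantities such as the shortest distance to a target are not modeled here.

```python
import math


def neighborhood_histogram(own_pos, own_heading, others,
                           comm_range=1.0, n_dist_bins=4, n_bearing_bins=8):
    """Count neighbors inside comm_range in (distance, bearing) bins.

    Hypothetical helper: all parameters are illustrative assumptions.
    Returns a list of n_dist_bins rows, each with n_bearing_bins counts.
    """
    hist = [[0] * n_bearing_bins for _ in range(n_dist_bins)]
    for x, y in others:
        dx, dy = x - own_pos[0], y - own_pos[1]
        dist = math.hypot(dx, dy)
        # Ignore the agent itself and anything outside the sensing range.
        if dist == 0.0 or dist >= comm_range:
            continue
        # Bearing relative to the agent's own heading, wrapped to [0, 2*pi).
        bearing = (math.atan2(dy, dx) - own_heading) % (2.0 * math.pi)
        # Clamp indices to guard against floating-point edge cases.
        d_bin = min(int(dist / comm_range * n_dist_bins), n_dist_bins - 1)
        b_bin = min(int(bearing / (2.0 * math.pi) * n_bearing_bins),
                    n_bearing_bins - 1)
        hist[d_bin][b_bin] += 1
    return hist


# Two neighbors in range, one out of range, and the agent's own position.
h = neighborhood_histogram((0.0, 0.0), 0.0,
                           [(-0.5, 0.1), (0.2, -0.1), (2.0, 2.0), (0.0, 0.0)])
```

A fixed-size histogram of this kind keeps the observation dimension independent of the number of neighbors, which is one plausible reason such an encoding suits a policy network with a fixed input size.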



Acknowledgments

The research leading to these results has received funding from EPSRC under grant agreement EP/R02572X/1 (National Center for Nuclear Robotics). Calculations for this research were conducted on the Lichtenberg high performance computer of the TU Darmstadt.

Author information

Authors and Affiliations

School of Computer Science, University of Lincoln, Lincoln, UK
Maximilian Hüttenrauch & Gerhard Neumann
Department of Electrical Engineering, Technische Universität Darmstadt, Darmstadt, Germany
Adrian Šošić


Corresponding author

Correspondence to Maximilian Hüttenrauch.

Editor information

Editors and Affiliations

Université Libre de Bruxelles, Brussels, Belgium
Marco Dorigo
Université Libre de Bruxelles, Brussels, Belgium
Mauro Birattari
Artificial Intelligence Research Institute, Bellaterra, Spain
Christian Blum
University of Southern Denmark, Odense, Denmark
Anders L. Christensen
University of Sheffield, Sheffield, UK
Andreagiovanni Reina
National Research Council, Rome, Italy
Vito Trianni


Cite this paper

Hüttenrauch, M., Šošić, A., Neumann, G. (2018). Local Communication Protocols for Learning Complex Swarm Behaviors with Deep Reinforcement Learning. In: Dorigo, M., Birattari, M., Blum, C., Christensen, A., Reina, A., Trianni, V. (eds) Swarm Intelligence. ANTS 2018. Lecture Notes in Computer Science, vol 11172. Springer, Cham. https://doi.org/10.1007/978-3-030-00533-7_6


DOI: https://doi.org/10.1007/978-3-030-00533-7_6
Published: 03 October 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-00532-0
Online ISBN: 978-3-030-00533-7
eBook Packages: Computer Science, Computer Science (R0)
