Local Communication Protocols for Learning Complex Swarm Behaviors with Deep Reinforcement Learning

  • Conference paper
Swarm Intelligence (ANTS 2018)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 11172)

Abstract

Swarm systems constitute a challenging problem for reinforcement learning (RL) as the algorithm needs to learn decentralized control policies that can cope with limited local sensing and communication abilities of the agents. While it is often difficult to directly define the behavior of the agents, simple communication protocols can be defined more easily using prior knowledge about the given task. In this paper, we propose a number of simple communication protocols that can be exploited by deep reinforcement learning to find decentralized control policies in a multi-robot swarm environment. The protocols are based on histograms that encode the local neighborhood relations of the agents and can also transmit task-specific information, such as the shortest distance and direction to a desired target. In our framework, we use an adaptation of Trust Region Policy Optimization to learn complex collaborative tasks, such as formation building and building a communication link. We evaluate our findings in a simulated 2D-physics environment, and compare the implications of different communication protocols.
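The histogram idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify bin layouts or sensing ranges, so the function name `neighbor_histogram`, its parameters, and all default values below are assumptions chosen for illustration. The sketch counts the neighbors an agent can sense within a communication radius and sorts them into distance and bearing bins, which yields a fixed-length observation regardless of how many neighbors are present.

```python
import math


def neighbor_histogram(own_pos, own_heading, neighbor_positions,
                       comm_radius=2.0, n_dist_bins=4, n_bearing_bins=8):
    """Count sensed neighbors in (distance, bearing) bins.

    Hypothetical helper: names, defaults, and bin layout are assumptions,
    not taken from the paper.
    """
    hist = [[0] * n_bearing_bins for _ in range(n_dist_bins)]
    for x, y in neighbor_positions:
        dx, dy = x - own_pos[0], y - own_pos[1]
        dist = math.hypot(dx, dy)
        if dist == 0.0 or dist >= comm_radius:
            continue  # ignore co-located points and agents out of range
        # Bearing relative to the agent's own heading, wrapped to [0, 2*pi)
        bearing = (math.atan2(dy, dx) - own_heading) % (2.0 * math.pi)
        d_bin = min(int(dist / comm_radius * n_dist_bins), n_dist_bins - 1)
        b_bin = min(int(bearing / (2.0 * math.pi) * n_bearing_bins),
                    n_bearing_bins - 1)
        hist[d_bin][b_bin] += 1
    return hist


# Usage: two neighbors in range, one out of range.
hist = neighbor_histogram((0.0, 0.0), 0.0,
                          [(1.0, 0.1), (-0.5, 0.6), (5.0, 5.0)])
observation = [count for row in hist for count in row]  # flat policy input
```

Because the flattened histogram always has `n_dist_bins * n_bearing_bins` entries, it could be fed to a fixed-size policy network even as the number of neighbors changes, which is one plausible reason such an encoding suits decentralized swarm policies.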

Acknowledgments

The research leading to these results has received funding from EPSRC under grant agreement EP/R02572X/1 (National Center for Nuclear Robotics). Calculations for this research were conducted on the Lichtenberg high performance computer of the TU Darmstadt.

Corresponding author

Correspondence to Maximilian Hüttenrauch.

Copyright information

© 2018 Springer Nature Switzerland AG

About this paper

Cite this paper

Hüttenrauch, M., Šošić, A., Neumann, G. (2018). Local Communication Protocols for Learning Complex Swarm Behaviors with Deep Reinforcement Learning. In: Dorigo, M., Birattari, M., Blum, C., Christensen, A., Reina, A., Trianni, V. (eds) Swarm Intelligence. ANTS 2018. Lecture Notes in Computer Science, vol 11172. Springer, Cham. https://doi.org/10.1007/978-3-030-00533-7_6

  • DOI: https://doi.org/10.1007/978-3-030-00533-7_6

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-00532-0

  • Online ISBN: 978-3-030-00533-7

  • eBook Packages: Computer Science, Computer Science (R0)
