Distributed Reinforcement Learning for Multi-robot Decentralized Collective Construction

  • Guillaume Sartoretti (corresponding author)
  • Yue Wu
  • William Paivine
  • T. K. Satish Kumar
  • Sven Koenig
  • Howie Choset
Conference paper
Part of the Springer Proceedings in Advanced Robotics book series (SPAR, volume 9)

Abstract

Inspired by recent advances in single-agent reinforcement learning, this paper extends the single-agent asynchronous advantage actor-critic (A3C) algorithm to enable multiple agents to learn a homogeneous, distributed policy, where agents work together toward a common goal without explicitly interacting. Our approach relies on centralized policy and critic learning, but decentralized policy execution, in a fully observable system. We show that the combined experience of all agents can be leveraged to quickly train a collaborative policy that naturally scales to smaller and larger swarms. We demonstrate the applicability of our method on a multi-robot construction problem, where agents need to arrange simple block elements to build a user-specified structure. We present simulation results where swarms of various sizes successfully construct different test structures without the need for additional training.
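The idea of centralized learning with decentralized execution can be illustrated with a toy sketch. The code below is not the authors' A3C implementation: it replaces the neural network with a small tabular actor-critic, and the task, reward function, class name, and hyperparameters are all hypothetical choices made for illustration. It only shows the structural point of the abstract, namely that every agent acts on its own but all agents write their experience into one shared policy and critic, so the trained policy can then be run by any number of agents.

```python
import math
import random

N_STATES, N_ACTIONS = 3, 2  # assumed toy problem size


def softmax(prefs):
    """Numerically stable softmax over a list of action preferences."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


class SharedActorCritic:
    """Tabular policy and critic shared by all agents (hypothetical toy)."""

    def __init__(self):
        self.prefs = [[0.0] * N_ACTIONS for _ in range(N_STATES)]  # actor
        self.values = [0.0] * N_STATES                             # critic

    def act(self, state, rng):
        # Decentralized execution: an agent only needs its own state.
        probs = softmax(self.prefs[state])
        return rng.choices(range(N_ACTIONS), weights=probs)[0]

    def update(self, state, action, reward, lr=0.1):
        # Centralized learning: every agent's sample updates the same tables.
        advantage = reward - self.values[state]
        probs = softmax(self.prefs[state])
        for a in range(N_ACTIONS):
            grad = (1.0 if a == action else 0.0) - probs[a]
            self.prefs[state][a] += lr * advantage * grad
        self.values[state] += lr * advantage


def toy_reward(state, action):
    # Assumed one-step task: the rewarded action is state % N_ACTIONS.
    return 1.0 if action == state % N_ACTIONS else 0.0


def train(n_agents=4, steps=500, seed=0):
    rng = random.Random(seed)
    model = SharedActorCritic()
    for _ in range(steps):
        for _agent in range(n_agents):  # each agent contributes experience
            s = rng.randrange(N_STATES)
            a = model.act(s, rng)
            model.update(s, a, toy_reward(s, a))
    return model


model = train()
# Greedy action per state under the shared policy.
greedy = [max(range(N_ACTIONS), key=lambda a: model.prefs[s][a])
          for s in range(N_STATES)]
# The same policy can be executed by a different number of agents.
swarm_actions = [model.act(1, random.Random(i)) for i in range(7)]
```

Because the policy is homogeneous, changing the swarm size at execution time only changes how many times `act` is called; no retraining is involved in this sketch.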

Keywords

Multi-agent system, Collaboration, Distributed learning

Notes

Acknowledgements

The research at the University of Southern California was supported by the National Science Foundation (NSF) under grant numbers 1409987, 1724392 and 1817189. Detailed comments from anonymous referees contributed to the presentation and quality of this paper.

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Guillaume Sartoretti (1), corresponding author
  • Yue Wu (1)
  • William Paivine (1)
  • T. K. Satish Kumar (2)
  • Sven Koenig (2)
  • Howie Choset (1)

  1. Robotics Institute, Carnegie Mellon University, Pittsburgh, USA
  2. University of Southern California, Los Angeles, USA
