Abstract
Inspired by recent advances in single-agent reinforcement learning, this paper extends the single-agent asynchronous advantage actor-critic (A3C) algorithm to enable multiple agents to learn a homogeneous, distributed policy, where agents work together toward a common goal without explicitly interacting. Our approach relies on centralized policy and critic learning but decentralized policy execution, in a fully observable system. We show that the combined experience of all agents can be leveraged to quickly train a collaborative policy that naturally scales to smaller and larger swarms. We demonstrate the applicability of our method on a multi-robot construction problem, where agents need to arrange simple block elements to build a user-specified structure. We present simulation results in which swarms of various sizes successfully construct different test structures without the need for additional training.
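The idea of centralized learning with decentralized execution can be illustrated with a minimal sketch. This is not the paper's implementation: it assumes a tiny linear-softmax policy in place of a neural network, takes the advantage as a given number instead of computing it from a learned critic, and omits the asynchronous workers of A3C. The names `policy`, `act`, `shared_update`, `N_ACTIONS`, and `OBS_DIM` are hypothetical and chosen only for illustration.

```python
import math
import random

N_ACTIONS = 3   # hypothetical number of actions
OBS_DIM = 4     # hypothetical observation size


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def policy(theta, obs):
    """Action probabilities from the shared linear-softmax policy."""
    logits = [sum(w * o for w, o in zip(row, obs)) for row in theta]
    return softmax(logits)


def act(theta, obs, rng):
    """Decentralized execution: an agent needs only theta and its own obs."""
    probs = policy(theta, obs)
    return rng.choices(range(N_ACTIONS), weights=probs)[0]


def policy_gradient(theta, obs, action, advantage):
    """Gradient of advantage * log pi(action | obs) with respect to theta."""
    probs = policy(theta, obs)
    return [[advantage * ((1.0 if a == action else 0.0) - probs[a]) * o
             for o in obs] for a in range(N_ACTIONS)]


def shared_update(theta, experiences, lr=0.1):
    """Centralized learning: pool every agent's gradient into one theta."""
    new_theta = [row[:] for row in theta]
    for obs, action, advantage in experiences:
        grad = policy_gradient(theta, obs, action, advantage)
        for a in range(N_ACTIONS):
            for i in range(OBS_DIM):
                new_theta[a][i] += lr * grad[a][i]
    return new_theta


rng = random.Random(0)
theta = [[0.0] * OBS_DIM for _ in range(N_ACTIONS)]
obs = [1.0, 0.0, 0.0, 0.0]
# Assumed toy data: three agents each report that action 2 had
# positive advantage in the same observation.
experiences = [(obs, 2, 1.0) for _ in range(3)]
theta = shared_update(theta, experiences)
```

Because every agent reads the same `theta`, the number of agents calling `act` can change at execution time without retraining, which is the sense in which a homogeneous policy can scale to swarms of different sizes.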
Acknowledgements
The research at the University of Southern California was supported by the National Science Foundation (NSF) under grant numbers 1409987, 1724392 and 1817189. Detailed comments from anonymous referees contributed to the presentation and quality of this paper.
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Sartoretti, G., Wu, Y., Paivine, W., Kumar, T.K.S., Koenig, S., Choset, H. (2019). Distributed Reinforcement Learning for Multi-robot Decentralized Collective Construction. In: Correll, N., Schwager, M., Otte, M. (eds) Distributed Autonomous Robotic Systems. Springer Proceedings in Advanced Robotics, vol 9. Springer, Cham. https://doi.org/10.1007/978-3-030-05816-6_3
DOI: https://doi.org/10.1007/978-3-030-05816-6_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-05815-9
Online ISBN: 978-3-030-05816-6
eBook Packages: Intelligent Technologies and Robotics (R0)