Combining Planning with Reinforcement Learning for Multi-robot Task Allocation
We describe an approach to the multi-robot task allocation (MRTA) problem in which a group of robots must perform tasks that arise continuously, at arbitrary locations across a large space. A dynamic scheduling algorithm is derived in which proposed plans are evaluated using a combination of short-term lookahead and a value function acquired by reinforcement learning. We demonstrate that this dynamic scheduler can learn not only to allocate robots to tasks efficiently, but also to position the robots appropriately in readiness for new tasks (tactical awareness), and conserve resources over the long run (strategic awareness).
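The plan-evaluation idea in the abstract can be illustrated with a minimal sketch. It is not the authors' scheduler: the one-step lookahead, the travel-distance reward, the unit-square workspace, the linear form of `learned_value`, and every function name below are assumptions made for illustration. In the approach described, the value function would be acquired by reinforcement learning; here its weights are simply passed in.

```python
from itertools import permutations
from math import dist

def lookahead_reward(robots, tasks, plan):
    """Short-term reward of a plan: negative total travel distance (assumed cost model)."""
    return -sum(dist(robots[r], tasks[t]) for r, t in plan)

def resulting_positions(robots, tasks, plan):
    """Robot positions once the plan is executed: assigned robots end at their task."""
    after = list(robots)
    for r, t in plan:
        after[r] = tasks[t]
    return after

def learned_value(positions, weights):
    """Stand-in for a value function learned by RL (hypothetical linear form).
    Its single feature is the mean distance of robots from the centre of a unit square."""
    centre = (0.5, 0.5)
    spread = sum(dist(p, centre) for p in positions) / len(positions)
    return weights[0] + weights[1] * spread

def candidate_plans(n_robots, n_tasks):
    """Every way to give each task to a distinct robot (assumes n_tasks <= n_robots)."""
    for chosen in permutations(range(n_robots), n_tasks):
        yield [(r, t) for t, r in enumerate(chosen)]

def best_plan(robots, tasks, weights, gamma=0.9):
    """Pick the plan maximising lookahead reward plus discounted value of the end state."""
    def score(plan):
        after = resulting_positions(robots, tasks, plan)
        return lookahead_reward(robots, tasks, plan) + gamma * learned_value(after, weights)
    return max(candidate_plans(len(robots), len(tasks)), key=score)

robots = [(0.5, 0.5), (0.0, 0.0)]
tasks = [(0.3, 0.3)]
greedy = best_plan(robots, tasks, weights=(0.0, 0.0))     # value term switched off
tactical = best_plan(robots, tasks, weights=(0.0, -1.0))  # value penalises poor positioning
```

With the value term switched off, the nearest robot is sent to the task. With a value function that penalises robots ending up far from the centre, the sketch instead sends the more distant robot and keeps the central one in place, which is the kind of positioning behaviour the abstract calls tactical awareness.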
Keywords: Reinforcement Learning, Planning Horizon, Partially Observable Markov Decision Process, Partial Observability, Policy Search