Structural Abstraction Experiments in Reinforcement Learning
A challenge in applying reinforcement learning to large problems is managing the explosive increase in storage and time complexity. This is especially problematic in multi-agent systems, where the state space grows exponentially in the number of agents. Function approximation based on simple supervised learning is unlikely to scale to complex domains on its own, but structural abstraction that exploits system properties and problem representations shows more promise. In this paper, we investigate several classes of known abstractions: 1) symmetry, 2) decomposition into multiple agents, 3) hierarchical decomposition, and 4) sequential execution. We empirically compare memory requirements, learning time, and solution quality in two problem variations. Our results indicate that the most effective solutions come from combinations of structural abstractions, and they encourage the development of methods for automatically discovering such abstractions in novel problem formulations.
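To make the scaling claim concrete, here is a minimal sketch of how a tabular joint state space grows with the number of agents, and how much a symmetry abstraction could save. It is illustrative only and is not the paper's experimental setup. It assumes each agent's local state is one of `k` discrete values and that agents are fully interchangeable. The function names (`joint_states`, `symmetric_states`, `canonical`, `enumerate_symmetric`) are hypothetical.

```python
from itertools import product
from math import comb


def joint_states(k: int, n: int) -> int:
    """Joint states for n distinguishable agents, each in one of k local states."""
    return k ** n


def symmetric_states(k: int, n: int) -> int:
    """Joint states when agents are interchangeable.

    Assumption: swapping two agents never changes the value of a state,
    so a joint state is a multiset of n local states drawn from k values.
    """
    return comb(k + n - 1, n)


def canonical(state):
    """Map a joint state to one representative of its symmetry class by sorting."""
    return tuple(sorted(state))


def enumerate_symmetric(k: int, n: int) -> int:
    """Brute-force check: count distinct canonical forms over all joint states."""
    return len({canonical(s) for s in product(range(k), repeat=n)})


if __name__ == "__main__":
    k = 10  # hypothetical number of local states per agent
    for n in range(1, 6):
        print(f"agents={n} joint={joint_states(k, n)} symmetric={symmetric_states(k, n)}")
```

Under these assumptions, a learner could index its value table by `canonical(state)` instead of the raw joint state. The table still grows with the number of agents, but more slowly than `k ** n`.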
Keywords: Reinforcement Learning, Joint Action, Travel Salesman Problem, Sequential Execution, Hierarchical Decomposition