Improving Space Representation in Multiagent Learning via Tile Coding
Reinforcement learning is an efficient, widely used machine learning technique that performs well on problems with a small number of states and actions. Multiagent learning problems rarely fit this description, so standard approaches may not be adequate for them. An alternative is to use techniques that generalize over the state space, allowing agents to learn through abstractions. The focus of this work is therefore to combine multiagent learning with one such generalization technique, namely tile coding. This kind of method is key in scenarios where agents have a large number of states to explore. In the scenarios used to test and validate this approach, our results indicate that the proposed representation outperforms the tabular one and is therefore an effective alternative.
Keywords: Reinforcement Learning, Multiagent System, Markov Decision Process, Independent Learner, Insertion Rate
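To make the idea of generalizing over the state space concrete, the following is a minimal sketch of tile coding combined with a Q-learning update. It is not the authors' implementation: the class names `TileCoder` and `TileCodedQ`, the uniform offset scheme, and all parameter values are assumptions chosen for illustration. In a multiagent setting with independent learners, each agent could hold its own instance.

```python
class TileCoder:
    """Maps a continuous state to one active tile per tiling (hypothetical helper)."""

    def __init__(self, lows, highs, tiles_per_dim, num_tilings):
        self.lows = list(lows)
        self.highs = list(highs)
        self.tiles_per_dim = tiles_per_dim
        self.num_tilings = num_tilings
        self.dims = len(self.lows)
        # One extra tile per dimension absorbs the offset of each tiling.
        self.tiles_per_tiling = (tiles_per_dim + 1) ** self.dims
        self.num_features = num_tilings * self.tiles_per_tiling

    def active_tiles(self, state):
        indices = []
        for t in range(self.num_tilings):
            flat = 0
            for d in range(self.dims):
                width = (self.highs[d] - self.lows[d]) / self.tiles_per_dim
                # Assumed scheme: tiling t is shifted by t/num_tilings of a tile width.
                offset = width * t / self.num_tilings
                clipped = min(max(state[d], self.lows[d]), self.highs[d])
                coord = int((clipped - self.lows[d] + offset) / width)
                coord = min(coord, self.tiles_per_dim)  # guard against rounding
                flat = flat * (self.tiles_per_dim + 1) + coord
            indices.append(t * self.tiles_per_tiling + flat)
        return indices


class TileCodedQ:
    """Q-learning with a linear value function over tile features (sketch)."""

    def __init__(self, coder, num_actions, alpha=0.1, gamma=0.9):
        self.coder = coder
        self.num_actions = num_actions
        # Split the step size across tilings so one update moves Q by alpha * error.
        self.alpha = alpha / coder.num_tilings
        self.gamma = gamma
        self.weights = [[0.0] * coder.num_features for _ in range(num_actions)]

    def value(self, state, action):
        # Q(s, a) is the sum of the weights of the tiles active in state s.
        return sum(self.weights[action][i] for i in self.coder.active_tiles(state))

    def update(self, state, action, reward, next_state):
        best_next = max(self.value(next_state, a) for a in range(self.num_actions))
        td_error = reward + self.gamma * best_next - self.value(state, action)
        for i in self.coder.active_tiles(state):
            self.weights[action][i] += self.alpha * td_error


coder = TileCoder(lows=[0.0, 0.0], highs=[1.0, 1.0], tiles_per_dim=4, num_tilings=3)
agent = TileCodedQ(coder, num_actions=2, alpha=0.3)
agent.update([0.5, 0.5], action=0, reward=1.0, next_state=[0.9, 0.9])
```

After this single update, a nearby state such as `[0.52, 0.5]` shares tiles with `[0.5, 0.5]` and so inherits part of the learned value, while a distant state such as `[0.0, 0.0]` is unaffected. A tabular representation would have updated only the one visited entry.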