Gradient Based Method for Symmetric and Asymmetric Multiagent Reinforcement Learning
This paper introduces a gradient-based method for both symmetric and asymmetric multiagent reinforcement learning. Symmetric multiagent reinforcement learning addresses problems in which all agents involved in the learning task have equal information states. In asymmetric multiagent reinforcement learning, by contrast, the information states are not equal: some agents (leaders) try to encourage agents with less information (followers) to select actions that improve the overall utility of the leaders. In both cases the number of parameters to learn is very large, so parametric function approximation is needed to represent the value functions of the agents. The proposed method is based on the VAPS framework, which is extended to use the theory of Markov games, a natural basis for multiagent reinforcement learning.
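To make the leader/follower distinction concrete, the following minimal sketch contrasts the two solution concepts in a single two-player matrix game. It is only an illustration under stated assumptions, not the method of the paper: it uses pure strategies, exhaustive enumeration instead of gradient-based learning, and no function approximation. The function names `stackelberg_pure` and `pure_nash` and the payoff matrices are hypothetical.

```python
from typing import List, Tuple

Matrix = List[List[float]]  # payoff[a][b]: row player picks a, column player picks b


def stackelberg_pure(leader: Matrix, follower: Matrix) -> Tuple[int, int]:
    """Asymmetric case: the leader commits first and the follower best-responds.

    Assumption: ties are broken in favour of the lowest action index.
    """
    best_pair = (0, 0)
    best_value = float("-inf")
    for a, follower_row in enumerate(follower):
        # Follower's best response to the leader's action a.
        b = max(range(len(follower_row)), key=lambda j: follower_row[j])
        if leader[a][b] > best_value:
            best_value = leader[a][b]
            best_pair = (a, b)
    return best_pair


def pure_nash(row_player: Matrix, col_player: Matrix) -> List[Tuple[int, int]]:
    """Symmetric case: list all pure-strategy Nash equilibria of the matrix game."""
    n_rows, n_cols = len(row_player), len(row_player[0])
    equilibria = []
    for a in range(n_rows):
        for b in range(n_cols):
            # Neither player can gain by deviating unilaterally.
            row_ok = all(row_player[a][b] >= row_player[i][b] for i in range(n_rows))
            col_ok = all(col_player[a][b] >= col_player[a][j] for j in range(n_cols))
            if row_ok and col_ok:
                equilibria.append((a, b))
    return equilibria


# Hypothetical payoffs: action 0 is dominant for the row player in simultaneous play.
LEADER = [[2.0, 4.0], [1.0, 3.0]]
FOLLOWER = [[1.0, 0.0], [0.0, 1.0]]

print(pure_nash(LEADER, FOLLOWER))         # simultaneous (symmetric) play
print(stackelberg_pure(LEADER, FOLLOWER))  # leader commits first (asymmetric) play
```

In this example the only pure Nash equilibrium gives the row player a payoff of 2, whereas committing first as a leader steers the follower to a joint action worth 3 to the leader. This is the sense in which a leader encourages followers toward actions that improve its own utility.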
Keywords: Nash Equilibrium, Matrix Game, Stackelberg Equilibrium, Nash Equilibrium Point, Asymmetric Model
- 1. Baird, L., Moore, A.: Gradient descent for general reinforcement learning. In: Kearns, M., Solla, S., Cohn, D. (eds.) Advances in Neural Information Processing Systems, vol. 11. MIT Press, Cambridge, MA, USA (1999)
- 4. Hu, J., Wellman, M.P.: Multiagent reinforcement learning: Theoretical framework and an algorithm. In: Proceedings of the Fifteenth International Conference on Machine Learning (ICML 1998), Madison, Wisconsin, USA. Morgan Kaufmann Publishers, San Francisco (July 1998)