This paper investigates the optimization problem for a two-person zero-sum discounted Markov game with a vector-valued loss function. The optimality criterion of the game is defined through the domination structure determined by a convex cone D. The state space of the game process is a countable set, and the action spaces of the players are compact metric spaces. Under certain assumptions on the loss function and the transition probability measure, we scalarize the loss function by a weighting factor to establish optimal strategies for the players. We prove that a saddle point of the resulting zero-sum discounted numerical game is a D-saddle point of the original Markov game. Conversely, under some additional conditions, any D-saddle point is a saddle point of a numerical loss function obtained by scalarization with some weighting factor. Further, the relations among D-saddle points, sub-gradients of the upper support function, and super-gradients of the lower support function are discussed.
Keywords: Markov game, D-saddle point, contraction operator
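The scalarization step described in the abstract can be illustrated with a minimal sketch. It is not the paper's construction: it assumes finite state and action sets (the paper allows a countable state space and compact action spaces), it takes the weighting factor as a given vector `weights`, and it iterates the min-max operator over pure actions only, which yields the upper value of the scalarized game rather than a saddle point in general. All names (`scalarize`, `upper_value_iteration`, `loss`, `trans`) are hypothetical.

```python
def scalarize(vec, weights):
    """Collapse a vector-valued loss to a number via a weighting factor."""
    return sum(w * x for w, x in zip(weights, vec))


def upper_value_iteration(states, actions1, actions2, loss, trans,
                          weights, beta, tol=1e-10, max_iter=10_000):
    """Iterate T v(s) = min_a max_b [ <weights, loss> + beta * E v(next) ].

    Assumption: player 1 minimizes, player 2 maximizes, both over pure
    actions. For 0 <= beta < 1 the operator is a contraction with modulus
    beta in the sup norm, so the iteration converges to its fixed point.
    """
    v = {s: 0.0 for s in states}
    for _ in range(max_iter):
        new_v = {}
        for s in states:
            new_v[s] = min(
                max(
                    scalarize(loss[s, a, b], weights)
                    + beta * sum(p * v[t] for t, p in trans[s, a, b].items())
                    for b in actions2
                )
                for a in actions1
            )
        gap = max(abs(new_v[s] - v[s]) for s in states)
        v = new_v
        if gap < tol:
            break
    return v


# Toy data (made up for illustration): one state, two actions per player,
# losses in R^2, and weights (0.5, 0.5).
states = ["s"]
actions1 = ["a0", "a1"]
actions2 = ["b0", "b1"]
loss = {
    ("s", "a0", "b0"): (1.0, 3.0),
    ("s", "a0", "b1"): (2.0, 4.0),
    ("s", "a1", "b0"): (0.0, 2.0),
    ("s", "a1", "b1"): (5.0, 3.0),
}
# Every action pair leads back to the single state with probability 1.
trans = {key: {"s": 1.0} for key in loss}

value = upper_value_iteration(states, actions1, actions2, loss, trans,
                              weights=(0.5, 0.5), beta=0.5)
print(value)
```

In this toy instance the scalarized stage losses form the matrix [[2, 3], [1, 4]], which has a pure saddle point at (a0, b1) with stage value 3, so the fixed point satisfies v = 3 + 0.5 v and the iteration approaches 6.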