Abstract
In this chapter, an iterative adaptive dynamic programming (ADP) method is presented for solving a class of continuous-time nonlinear two-person zero-sum differential games. The idea is to use the ADP technique to obtain the optimal control pair iteratively, so that the performance index function reaches the saddle point of the zero-sum differential game. When the saddle point does not exist, a mixed optimal control pair is obtained so that the performance index function reaches the mixed optimum. Rigorous proofs are given to guarantee that the control pair stabilizes the nonlinear system, and the convergence of the performance index function is also proved. To facilitate implementation of the iterative ADP method, neural networks are used to approximate the performance index function, compute the optimal control policy, and model the nonlinear system, respectively. Two examples are given to demonstrate the validity of the proposed method.
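As a reading aid, the saddle-point condition mentioned above can be written out in a standard form. This is only a sketch under assumed notation, since the abstract defines no symbols: the state x, the controls u (taken here as the minimizing player) and w (the maximizing player), the dynamics f, the utility l, and the performance index V are all illustrative names, and the infinite-horizon integral form is an assumption rather than something stated in the abstract.

```latex
% Assumed notation (not from the source): x, u, w, f, l, V are illustrative.
\begin{align}
  \dot{x}(t) &= f\bigl(x(t), u(t), w(t)\bigr), \qquad x(0) = x_0, \\
  V(x_0, u, w) &= \int_0^{\infty} l\bigl(x(t), u(t), w(t)\bigr)\, dt, \\
  % Upper and lower values of the game (u minimizes, w maximizes):
  \overline{V}(x_0) &= \inf_{u} \sup_{w} V(x_0, u, w), \qquad
  \underline{V}(x_0) = \sup_{w} \inf_{u} V(x_0, u, w), \\
  % The lower value never exceeds the upper value:
  \underline{V}(x_0) &\le \overline{V}(x_0), \\
  % A pair (u^*, w^*) is a saddle point if, for all admissible u and w:
  V(x_0, u^*, w) &\le V(x_0, u^*, w^*) \le V(x_0, u, w^*).
\end{align}
```

In this form, a saddle point exists when the upper and lower values coincide and are attained by some pair (u^*, w^*). When the inequality between them is strict, no such pair exists, which is the situation the mixed optimal control pair is meant to handle.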
Copyright information
© 2019 Science Press, Beijing and Springer Nature Singapore Pte Ltd.
About this chapter
Cite this chapter
Song, R., Wei, Q., Li, Q. (2019). An Iterative ADP Method to Solve for a Class of Nonlinear Zero-Sum Differential Games. In: Adaptive Dynamic Programming: Single and Multiple Controllers. Studies in Systems, Decision and Control, vol 166. Springer, Singapore. https://doi.org/10.1007/978-981-13-1712-5_10
DOI: https://doi.org/10.1007/978-981-13-1712-5_10
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-1711-8
Online ISBN: 978-981-13-1712-5
eBook Packages: Intelligent Technologies and Robotics (R0)