Abstract
In this paper, the optimal control problem of nonzero-sum (NZS) games with partially unknown dynamics is investigated. The off-policy reinforcement learning (RL) method is proposed to approximate the solution of the coupled Hamilton-Jacobi (HJ) equations. A single critic network structure for each player is constructed using neural network (NN) technique. To improve the applicability of the off-policy RL method, the tuning laws of critic weights are designed based on the offline learning and online learning methods, respectively. The simulation study demonstrates the effectiveness of the proposed algorithms.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Similar content being viewed by others
References
Friedman, A.: Differential Games. Courier Corporation, Mineola (2013)
Zhang, Q., Zhao, D., Zhu, Y.: Event-triggered \(H_\infty \) control for continuous-time nonlinear system via concurrent learning. IEEE Trans. Syst. Man Cybern. Syst. 47, 1071–1081 (2016). doi:10.1109/TSMC.2016.2531680
Zhang, Q., Zhao, D., Zhu, Y.: Event-triggered H8 control for continuous-time nonlinear system via concurrent learning. IEEE Trans. Syst. Man Cybern. Syst. 47(7), 1071–1081 (2017)
Starr, A.W., Ho, Y.C.: Nonzero-sum differential games. J. Optim. Theor. Appl. 3(3), 184–206 (1969)
Zhang, Q., Zhao, D., Zhu, Y.: Data-driven adaptive dynamic programming for continuous-time fully cooperative games with partially constrained inputs. Neurocomputing 238, 377–386 (2017)
Nash, J.: Non-cooperative games. Ann. Math. 54, 286–295 (1951)
Zhao, D., Zhang, Q., Wang, D., et al.: Experience replay for optimal control of nonzero-sum game systems with unknown dynamics. IEEE Trans. Cybern. 46(3), 854–865 (2016)
Zhu, Y., Zhao, D., He, H., et al.: Event-triggered optimal control for partially unknown constrained-input systems via adaptive dynamic programming. IEEE Trans. Ind. Electron. 64(5), 4101–4109 (2017)
Vamvoudakis, K.G., Lewis, F.L.: Multi-player non-zero-sum games: online adaptive learning solution of coupled Hamilton-Jacobi equations. Automatica 47(8), 1556–1569 (2011)
Vrabie, D., Lewis, F.L.: Neural network approach to continuous-time direct adaptive optimal control for partially unknown nonlinear systems. Neural Netw. 22(3), 237–246 (2009)
Kamalapurkar, R., Klotz, J.R., Dixon, W.E.: Concurrent learning-based approximate feedback-nash equilibrium solution of N-player nonzero-sum differential games. IEEE/CAA J. Automatica Sin. 1(3), 239–247 (2014)
Jiang, Y., Jiang, Z.: Robust adaptive dynamic programming for large-scale systems with an application to multimachine power systems. IEEE Trans. Circ. Syst. II Express Briefs 59(10), 693–697 (2012)
Wang, D., Liu, D., Zhang, Q., Zhao, D.: Data-based adaptive critic designs for nonlinear robust optimal control with uncertain dynamics. IEEE Trans. Syst. Man Cybern. Syst. 46(11), 1544–1555 (2016)
Mu, C., Ni, Z., Sun, C., He, H.: Data-driven tracking control with adaptive dynamic programming for a class of continuous-time nonlinear systems. IEEE Trans. Cybern. 47(6), 1460–1470 (2017)
Song, R., Lewis, F.L., Wei, Q.: Off-policy integral reinforcement learning method to solve nonlinear continuous-time multiplayer nonzero-sum games. IEEE Trans. Neural Netw. Learn. Syst. 28(3), 704–713 (2017)
Luo, B., Wu, H.N., Huang, T.: Off-policy reinforcement learning for \(H_\infty \) control design. IEEE Trans. Cybern. 45(1), 65–76 (2015)
Acknowledgements
This research is supported by National Natural Science Foundation of China (NSFC) under Grants No. 61573353, No. 61533017, by the National Key Research and Development Plan under Grants 2016YFB0101000.
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Zhang, Q., Zhao, D., Zhang, S. (2017). Off-Policy Reinforcement Learning for Partially Unknown Nonzero-Sum Games. In: Liu, D., Xie, S., Li, Y., Zhao, D., El-Alfy, ES. (eds) Neural Information Processing. ICONIP 2017. Lecture Notes in Computer Science(), vol 10634. Springer, Cham. https://doi.org/10.1007/978-3-319-70087-8_84
Download citation
DOI: https://doi.org/10.1007/978-3-319-70087-8_84
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-70086-1
Online ISBN: 978-3-319-70087-8
eBook Packages: Computer ScienceComputer Science (R0)