Discrete-Time Two-Player Zero-Sum Games for Nonlinear Systems Using Iterative Adaptive Dynamic Programming

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 9719)

Abstract

This paper addresses a discrete-time two-player zero-sum game for nonlinear systems, which is solved by a new iterative adaptive dynamic programming (ADP) method. The iterative ADP algorithm implements two iteration procedures, an upper iteration and a lower iteration, to obtain the upper and lower performance index functions, respectively. It is shown that, when initialized by an arbitrary positive semi-definite function, the iterative value functions converge to the optimal performance index function, provided that the optimal performance index function of the two-player zero-sum game exists. Finally, simulation results are given to illustrate the performance of the developed method.
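
The abstract does not state the recursions themselves, so the following is only a minimal sketch of what an upper and a lower iteration could look like, not the authors' algorithm. It assumes the common min-max and max-min Bellman backups, a gridded scalar state with linear interpolation, and a hypothetical system `F` and utility `U`; all names and numbers here are illustrative assumptions.

```python
import numpy as np

def upper_lower_iteration(F, U, xs, us, ws, V0, iters=60):
    """Grid-based upper (min-max) and lower (max-min) value iterations.

    Assumed form, not taken from the paper:
      upper: V(x) <- min_u max_w { U(x,u,w) + V(F(x,u,w)) }
      lower: V(x) <- max_w min_u { U(x,u,w) + V(F(x,u,w)) }
    """
    # Tabulate next states and stage costs on the (x, u, w) grid once.
    X, Ug, Wg = np.meshgrid(xs, us, ws, indexing="ij")
    x_next = np.clip(F(X, Ug, Wg), xs[0], xs[-1])  # keep next states on the grid
    cost = U(X, Ug, Wg)

    def backup(V):
        # Stage cost plus V at the next state, via linear interpolation.
        V_next = np.interp(x_next.ravel(), xs, V).reshape(x_next.shape)
        return cost + V_next

    # Both iterations start from the same positive semi-definite function.
    V_up = np.asarray(V0(xs), dtype=float)
    V_lo = V_up.copy()
    for _ in range(iters):
        V_up = backup(V_up).max(axis=2).min(axis=1)  # min over u of max over w
        V_lo = backup(V_lo).min(axis=1).max(axis=1)  # max over w of min over u
    return V_lo, V_up

# Hypothetical scalar example (system and utility are NOT from the paper).
xs = np.linspace(-2.0, 2.0, 81)   # state grid
us = np.linspace(-2.0, 2.0, 41)   # control grid (minimizing player)
ws = np.linspace(-1.0, 1.0, 21)   # disturbance grid (maximizing player)
F = lambda x, u, w: 0.8 * x + u + 0.2 * w
U = lambda x, u, w: x**2 + u**2 - 4.0 * w**2
V_lo, V_up = upper_lower_iteration(F, U, xs, us, ws, V0=lambda x: np.zeros_like(x))
print("largest gap between upper and lower values:", float(np.max(V_up - V_lo)))
```

In this sketch the lower iterate never exceeds the upper one, since a max-min is bounded by the corresponding min-max; a small gap between the two on the grid is what one would expect when the game has a saddle point.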

Q. Wei—This work was supported in part by the National Natural Science Foundation of China under Grants 61533017, 61374105, 61233001, 61304086, 61503379, and 61273140.

Author information

Corresponding author

Correspondence to Qinglai Wei.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Wei, Q., Liu, D. (2016). Discrete-Time Two-Player Zero-Sum Games for Nonlinear Systems Using Iterative Adaptive Dynamic Programming. In: Cheng, L., Liu, Q., Ronzhin, A. (eds) Advances in Neural Networks – ISNN 2016. ISNN 2016. Lecture Notes in Computer Science, vol 9719. Springer, Cham. https://doi.org/10.1007/978-3-319-40663-3_31

  • DOI: https://doi.org/10.1007/978-3-319-40663-3_31

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-40662-6

  • Online ISBN: 978-3-319-40663-3

  • eBook Packages: Computer Science, Computer Science (R0)
