
Generalized Policy Iteration ADP for Discrete-Time Nonlinear Systems


Part of the book series: Advances in Industrial Control ((AIC))

Abstract

In this chapter, generalized policy iteration (GPI) algorithms are developed to solve infinite-horizon optimal control problems for discrete-time nonlinear systems. GPI algorithms combine the ideas of the policy iteration and value iteration algorithms of adaptive dynamic programming (ADP): they permit an arbitrary positive semidefinite function to initialize the algorithm, and two revolving iterations are used for policy evaluation and policy improvement, respectively. The monotonicity, convergence, admissibility, and optimality properties of the developed GPI algorithms for discrete-time nonlinear systems are then analyzed. To implement the GPI algorithms, neural networks are employed to approximate the iterative value functions and to compute the iterative control laws, so as to obtain the approximate optimal control law. Simulation examples are included to verify the effectiveness of the developed algorithms.
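The iteration structure the abstract describes can be sketched concretely. Below is a minimal Python sketch of GPI on a discretized scalar system: an arbitrary positive semidefinite function initializes the value function, and each outer iteration alternates one policy-improvement minimization with N_i policy-evaluation sweeps under the fixed policy. The plant F, the utility U, the grids, and the schedule N_i = 3 are illustrative assumptions for this sketch, not the chapter's examples or its neural-network implementation.

    import numpy as np

    # Illustrative discrete-time nonlinear plant x_{k+1} = F(x_k, u_k)
    # and quadratic utility U(x, u); both are assumptions for this sketch.
    F = lambda x, u: 0.8 * np.sin(x) + u
    U = lambda x, u: x**2 + u**2

    xs = np.linspace(-2.0, 2.0, 201)   # discretized state grid
    us = np.linspace(-1.0, 1.0, 41)    # discretized control set

    def interp_V(V, x):
        # Evaluate the tabulated value function at arbitrary (clipped) states.
        return np.interp(np.clip(x, xs[0], xs[-1]), xs, V)

    # Arbitrary positive semidefinite initialization: V_0(x) >= 0, V_0(0) = 0.
    V = 5.0 * xs**2

    for i in range(30):                # outer GPI iterations
        # Policy improvement: v_i(x) = argmin_u { U(x, u) + V_i(F(x, u)) }.
        Q_xu = U(xs[:, None], us[None, :]) \
             + interp_V(V, F(xs[:, None], us[None, :]))
        policy = us[np.argmin(Q_xu, axis=1)]

        # Policy evaluation: N_i value-update sweeps under the fixed policy.
        # N_i = 1 recovers value iteration; N_i -> infinity recovers
        # policy iteration.
        for _ in range(3):             # N_i = 3 (assumed schedule)
            V = U(xs, policy) + interp_V(V, F(xs, policy))

    print("approx V*(0):", interp_V(V, 0.0))

The evaluation budget N_i is the design knob GPI exposes: a small budget keeps each outer iteration cheap, while a large budget evaluates each policy more accurately, which is why the family interpolates between the two classical ADP algorithms.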

Author information

Corresponding author

Correspondence to Derong Liu.

Copyright information

© 2017 Springer International Publishing AG

About this chapter

Cite this chapter

Liu, D., Wei, Q., Wang, D., Yang, X., Li, H. (2017). Generalized Policy Iteration ADP for Discrete-Time Nonlinear Systems. In: Adaptive Dynamic Programming with Applications in Optimal Control. Advances in Industrial Control. Springer, Cham. https://doi.org/10.1007/978-3-319-50815-3_5

  • DOI: https://doi.org/10.1007/978-3-319-50815-3_5

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-50813-9

  • Online ISBN: 978-3-319-50815-3

  • eBook Packages: Engineering, Engineering (R0)
