Value Iteration ADP for Discrete-Time Nonlinear Systems

Part of the book series: Advances in Industrial Control (AIC)

Abstract

In this chapter, optimal control problems of discrete-time nonlinear systems, including optimal regulation, optimal tracking control, and constrained optimal control, are studied by using a series of value iteration (VI) adaptive dynamic programming (ADP) approaches. First, an ADP scheme based on general value iteration (GVI) is developed to obtain near-optimal control for discrete-time affine nonlinear systems. Then, the GVI-based ADP algorithm is employed to solve the infinite-horizon optimal tracking control problem for a class of discrete-time nonlinear systems. Moreover, using the globalized dual heuristic programming technique, the VI-based optimal control strategy for unknown discrete-time nonlinear systems with input constraints is established as a special case. Finally, an iterative \(\theta \)-ADP algorithm is developed to solve the infinite-horizon optimal control problem of discrete-time nonlinear systems. It is shown that each of the iterative controls can stabilize the nonlinear system, so the requirement of an initial admissible control is avoided. Simulation examples are included to verify the effectiveness of the presented control strategies.
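As a rough illustration of the value iteration idea named in the abstract, the sketch below applies it to a scalar linear-quadratic problem. It is not taken from the chapter. The system x_{k+1} = a x_k + b u_k, the stage cost q x^2 + r u^2, the quadratic value function V_i(x) = p_i x^2, and all function and parameter names are assumptions chosen for illustration. Under these assumptions the value iteration update reduces to a scalar Riccati difference equation, and the iteration can be started from zero or from another nonnegative initial value without first supplying a stabilizing control.

```python
# Minimal sketch (assumed scalar linear-quadratic setting, not the chapter's algorithm):
#   system      x_{k+1} = a*x_k + b*u_k
#   stage cost  q*x^2 + r*u^2,  with q > 0 and r > 0
#   value fn    V_i(x) = p_i * x^2


def value_iteration_lq(a, b, q, r, p0=0.0, tol=1e-12, max_iter=10_000):
    """Iterate V_{i+1}(x) = min_u [q x^2 + r u^2 + V_i(a x + b u)].

    For a quadratic V_i the minimization has a closed form, so each
    sweep updates the single coefficient p_i. Returns the final
    coefficient and the number of sweeps performed.
    """
    p = p0
    for i in range(max_iter):
        p_next = q + a * a * p - (a * b * p) ** 2 / (r + b * b * p)
        if abs(p_next - p) < tol:
            return p_next, i + 1
        p = p_next
    return p, max_iter


def greedy_gain(a, b, r, p):
    """Gain k of the control u = -k*x that is greedy with respect to V(x) = p*x^2."""
    return a * b * p / (r + b * b * p)


# Hypothetical open-loop unstable example (|a| > 1).
a, b, q, r = 1.2, 1.0, 1.0, 1.0

p_from_zero, sweeps_zero = value_iteration_lq(a, b, q, r, p0=0.0)
p_from_five, sweeps_five = value_iteration_lq(a, b, q, r, p0=5.0)

k = greedy_gain(a, b, r, p_from_zero)
closed_loop_pole = a - b * k

print("p (start at 0):", p_from_zero, "sweeps:", sweeps_zero)
print("p (start at 5):", p_from_five, "sweeps:", sweeps_five)
print("closed-loop pole:", closed_loop_pole)
```

In this toy setting both starting points reach the same coefficient, and the greedy control computed from it places the closed-loop pole inside the unit circle. The nonlinear problems treated in the chapter have no such closed-form update, so the value function and control would need to be approximated instead, for example by neural networks.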


Corresponding author

Correspondence to Derong Liu.

Copyright information

© 2017 Springer International Publishing AG

About this chapter

Cite this chapter

Liu, D., Wei, Q., Wang, D., Yang, X., Li, H. (2017). Value Iteration ADP for Discrete-Time Nonlinear Systems. In: Adaptive Dynamic Programming with Applications in Optimal Control. Advances in Industrial Control. Springer, Cham. https://doi.org/10.1007/978-3-319-50815-3_2

  • DOI: https://doi.org/10.1007/978-3-319-50815-3_2

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-50813-9

  • Online ISBN: 978-3-319-50815-3

  • eBook Packages: Engineering, Engineering (R0)
