Optimal Tracking Control Scheme for Discrete-Time Nonlinear Systems with Approximation Errors

  • Qinglai Wei
  • Derong Liu
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7952)

Abstract

In this paper, we solve an infinite-time optimal tracking control problem for a class of discrete-time nonlinear systems using an iterative adaptive dynamic programming (ADP) algorithm. When the iterative tracking control law and the iterative performance index function cannot be obtained accurately in each iteration, a new convergence analysis method is developed to derive the convergence conditions of the iterative ADP algorithm from the properties of the finite approximation errors. If the convergence conditions are satisfied, it is shown that, under some mild assumptions, the iterative performance index functions converge to a finite neighborhood of the greatest lower bound of all performance index functions. To facilitate the implementation of the iterative ADP algorithm, neural networks are used to approximate the performance index function and to compute the optimal tracking control policy. Finally, a simulation example is given to illustrate the performance of the proposed method.
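The idea of iterating a performance index function under bounded approximation errors can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the scalar error dynamics `step`, the quadratic utility, the state and control grids, and the uniformly drawn relative error of size at most `eps` that stands in for neural-network approximation error. It runs a grid-based value iteration twice, once exactly and once with the injected error, and reports how far the two final iterates are apart.

```python
import math
import random

# Hypothetical scalar tracking-error dynamics e_{k+1} = 0.8*sin(e_k) + u_k
# (an assumption for illustration, not the system studied in the paper).
def step(e, u):
    return 0.8 * math.sin(e) + u

STATES = [i / 10 for i in range(-10, 11)]    # tracking-error grid on [-1, 1]
CONTROLS = [i / 10 for i in range(-10, 11)]  # control grid on [-1, 1]

def nearest(e):
    """Index of the grid point closest to e, after clamping to [-1, 1]."""
    e = max(-1.0, min(1.0, e))
    return int(round((e + 1.0) * 10))

def value_iteration(n_iter, eps, seed=0):
    """Value iteration from V_0 = 0 with a relative error of at most eps per update."""
    rng = random.Random(seed)
    v = [0.0] * len(STATES)
    for _ in range(n_iter):
        new_v = []
        for e in STATES:
            # Bellman backup: quadratic utility plus the value at the successor state.
            best = min(e * e + u * u + v[nearest(step(e, u))] for u in CONTROLS)
            # Injected approximation error, standing in for neural-network error.
            new_v.append(best * (1.0 + rng.uniform(-eps, eps)))
        v = new_v
    return v

exact = value_iteration(n_iter=30, eps=0.0)
approx = value_iteration(n_iter=30, eps=0.01)
gap = max(abs(a - b) for a, b in zip(exact, approx))
print(f"largest deviation from the exact iterate: {gap:.4f}")
```

Setting `eps=0.0` recovers the exact iteration, so the same function serves as its own baseline. The multiplicative form of the error is one possible modelling choice. An additive error would need a different bound.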

Keywords

Adaptive dynamic programming, generalized value iteration, neural networks, optimal control, reinforcement learning

Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Qinglai Wei (1)
  • Derong Liu (1)
  1. The State Key Laboratory of Management and Control for Complex Systems, Institute of Automation, Chinese Academy of Sciences, Beijing, China
