Adaptive Dynamic Programming for Minimal Energy Control with Guaranteed Convergence Rate of Linear Systems

  • Kai Zhang
  • Suoliang Ge
  • Yuling Ge
Regular Papers
Abstract

The traditional linear quadratic optimal control problem can be summarized as finding a state feedback controller such that the closed-loop system is stable and the performance index is minimized. It is well known that, under the standard assumptions, the solution of the linear quadratic optimal control problem can be obtained from the algebraic Riccati equation (ARE). However, results developed for the traditional linear quadratic optimal control problem cannot be directly applied to the problem of minimal energy control with guaranteed convergence rate (MECGCR), because the standard assumptions are not satisfied in the MECGCR setting. In this paper, we consider the MECGCR problem and prove that the ARE can be applied to solve it under some conditions. Furthermore, assuming the system dynamics are unknown, we propose a policy iteration (PI) based adaptive dynamic programming (ADP) algorithm that iteratively solves the ARE using online state and input information, without requiring a priori knowledge of the system matrices. Finally, a numerical example is worked out to show the effectiveness of the proposed approach.
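The model-based core underlying such a PI scheme is Kleinman's iteration: alternately solving a Lyapunov equation (policy evaluation) and updating the feedback gain (policy improvement), with the guaranteed convergence rate imposed by the classical Anderson-Moore shift of the system matrix. The sketch below illustrates this idea on a hypothetical second-order system; the matrices, the rate alpha, and the initial gain are illustrative choices, and the paper's data-driven algorithm would replace the explicit Lyapunov solve with least squares on measured state and input trajectories.

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov, solve_continuous_are

# Hypothetical unstable 2nd-order system (illustrative values only).
A = np.array([[0.0, 1.0], [-1.0, 2.0]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)   # state weight (positive definite here, so the standard
                # ARE assumptions hold in this model-based sketch)
R = np.eye(1)   # control-energy weight
alpha = 0.5     # prescribed convergence (decay) rate

# Anderson-Moore shift: design for A + alpha*I so that the closed loop
# of the original system decays at least as fast as e^{-alpha t}.
A_s = A + alpha * np.eye(2)

# Kleinman policy iteration, started from a stabilizing gain K0.
K = np.array([[10.0, 10.0]])   # stabilizes A_s - B @ K (check eigenvalues)
for _ in range(20):
    Ak = A_s - B @ K
    # Policy evaluation: solve Ak^T P + P Ak + Q + K^T R K = 0.
    P = solve_continuous_lyapunov(Ak.T, -(Q + K.T @ R @ K))
    # Policy improvement: K <- R^{-1} B^T P.
    K = np.linalg.solve(R, B.T @ P)

# The iterates converge to the stabilizing solution of the shifted ARE.
P_are = solve_continuous_are(A_s, B, Q, R)
print(np.allclose(P, P_are, atol=1e-6))
```

Because the iteration is run on the shifted matrix `A_s`, every eigenvalue of the resulting closed loop `A - B @ K` has real part below `-alpha`, which is exactly the guaranteed-convergence-rate requirement.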

Keywords

Adaptive dynamic programming, guaranteed convergence rate, minimal energy control, policy iteration

Copyright information

© ICROS, KIEE and Springer 2019

Authors and Affiliations

  1. School of Electrical Engineering and Automation, Hefei University of Technology, Hefei, Anhui, China