Model-free Optimal Tracking Control for an Aircraft Skin Inspection Robot with Constrained-input and Input Time-delay via Integral Reinforcement Learning

  • Xuewei Wu
  • Congqing Wang

Abstract

This paper presents a model-free optimal tracking control algorithm for an aircraft skin inspection robot subject to input constraints and input time delay. To handle the input time delay, the original system is transformed into a delay-free system with constrained input and an unknown input coupling term. To solve the optimal control problem under the input constraint, a discounted value function is employed. In general, the Hamilton-Jacobi-Bellman (HJB) equation does not admit a classical smooth solution. Moreover, because the input coupling term of the delay-free system is unknown, a model-free integral reinforcement learning (IRL) algorithm is proposed that requires only system sampling data generated by arbitrary control inputs and external disturbances. The model-free IRL method is implemented on an actor-critic neural network (NN) structure, and a set of system sampling data is used to learn the value function and the control policy. Finally, simulation results verify the effectiveness of the proposed algorithm.
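
The abstract does not state the value function itself. As a rough illustration only, constrained-input optimal control is often formulated with a non-quadratic input penalty of the form U(u) = 2 λ r ∫₀ᵘ atanh(v/λ) dv, which grows without bound as |u| approaches the saturation level λ. The sketch below evaluates that penalty for a scalar input in closed form and cross-checks it numerically. The function names, the scalar setting, and the symbols `lam` (saturation bound) and `r` (input weight) are assumptions for illustration and are not taken from the paper.

```python
import math


def input_penalty(u: float, lam: float, r: float = 1.0) -> float:
    """Closed form of U(u) = 2*lam*r * integral_0^u atanh(v/lam) dv.

    Hypothetical scalar example: lam is the assumed saturation bound,
    r the assumed input weight. Valid only for |u| < lam.
    """
    if abs(u) >= lam:
        raise ValueError("input must satisfy |u| < lam")
    # integral of atanh(v/lam) dv = v*atanh(v/lam) + (lam/2)*ln(1 - v^2/lam^2)
    return (2.0 * lam * r * u * math.atanh(u / lam)
            + lam ** 2 * r * math.log(1.0 - (u / lam) ** 2))


def input_penalty_numeric(u: float, lam: float, r: float = 1.0,
                          steps: int = 10000) -> float:
    """Midpoint-rule approximation of the same integral, used as a check."""
    h = u / steps
    total = sum(math.atanh((k + 0.5) * h / lam) for k in range(steps))
    return 2.0 * lam * r * total * h


if __name__ == "__main__":
    for u in (0.1, 0.5, 0.9, 0.99):
        print(u, input_penalty(u, lam=1.0), input_penalty_numeric(u, lam=1.0))
```

The penalty is zero at u = 0, symmetric in u, and increases steeply near the bound, which is why minimizing a cost of this kind tends to yield a bounded (tanh-shaped) control law rather than requiring the input to be clipped afterwards.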

Keywords

Constrained-input, input time-delay, model-free, reinforcement learning

Copyright information

© ICROS, KIEE and Springer 2019

Authors and Affiliations

  1. College of Automation Engineering, Nanjing University of Aeronautics and Astronautics, Nanjing, China