Abstract
This chapter establishes an optimal control scheme for unknown complex-valued nonlinear systems. Policy iteration (PI) is used to solve the Hamilton–Jacobi–Bellman (HJB) equation. Off-policy learning allows the iterative performance index function and the iterative control to be obtained with completely unknown dynamics. Critic and action networks, which execute policy evaluation and policy improvement, are used to approximate the iterative performance index function and the iterative control. Asymptotic stability of the closed-loop system and convergence of the iterative performance index function are proven. Using the Lyapunov technique, the weight estimation errors are shown to be uniformly ultimately bounded (UUB). A simulation study demonstrates the effectiveness of the proposed optimal control method.
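The alternation of policy evaluation and policy improvement that the abstract describes can be sketched on a toy problem. The example below is not the chapter's complex-valued method: it assumes a known real-valued linear system with quadratic cost, for which the HJB equation reduces to the discrete algebraic Riccati equation, so PI becomes Hewer's iteration. All matrices (`A`, `B`, `Q`, `R`) and the initial gain are illustrative assumptions.

```python
import numpy as np

# Toy policy-iteration (PI) sketch: alternate policy evaluation and
# policy improvement. Assumed (not from the chapter): a known linear
# system x_{k+1} = A x_k + B u_k with cost x'Qx + u'Ru and a
# stabilizing initial policy u_k = -K x_k.
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
Q = np.eye(2)
R = np.array([[1.0]])

K = np.array([[1.0, 1.0]])  # initial admissible (stabilizing) gain

for _ in range(50):
    # Policy evaluation: solve the Lyapunov equation
    # P = Q + K'RK + (A - BK)' P (A - BK) for the current policy,
    # here by fixed-point iteration (A - BK is Schur stable).
    Acl = A - B @ K
    P = Q + K.T @ R @ K
    for _ in range(500):
        P = Q + K.T @ R @ K + Acl.T @ P @ Acl
    # Policy improvement: greedy update of the feedback gain.
    K_new = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
    if np.linalg.norm(K_new - K) < 1e-9:
        K = K_new
        break
    K = K_new

print(K)  # iterates converge to the optimal LQR gain
```

In the chapter's off-policy setting, the policy-evaluation step would be replaced by learning from measured data with unknown dynamics, and the two steps would be carried out by the critic and action networks, respectively.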
Copyright information
© 2019 Science Press, Beijing and Springer Nature Singapore Pte Ltd.
Cite this chapter
Song, R., Wei, Q., Li, Q. (2019). Off-Policy Neuro-Optimal Control for Unknown Complex-Valued Nonlinear Systems. In: Adaptive Dynamic Programming: Single and Multiple Controllers. Studies in Systems, Decision and Control, vol 166. Springer, Singapore. https://doi.org/10.1007/978-981-13-1712-5_7
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-1711-8
Online ISBN: 978-981-13-1712-5
eBook Packages: Intelligent Technologies and Robotics (R0)