H∞ Control Synthesis for Linear Parabolic PDE Systems with Model-Free Policy Iteration

  • Biao Luo
  • Derong Liu
  • Xiong Yang
  • Hongwen Ma
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9377)


The H∞ control problem is considered for linear parabolic partial differential equation (PDE) systems with completely unknown system dynamics. We propose a model-free policy iteration (PI) method that learns the H∞ control policy from measured system data, without system model information. First, a finite-dimensional system of ordinary differential equations (ODEs) is derived that accurately describes the dominant dynamics of the parabolic PDE system. Based on this finite-dimensional ODE model, the H∞ control problem is reformulated as one that is theoretically equivalent to solving an algebraic Riccati equation (ARE). To solve the ARE without system model information, we propose a least-squares-based model-free PI approach that uses real system data. Finally, simulation results demonstrate the effectiveness of the developed model-free PI method.
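To make the link between policy iteration and the ARE concrete, the following is a minimal sketch for a scalar system, assuming the standard zero-sum game form dx/dt = a x + b u + d w with cost integrand q x^2 + r u^2 - gamma^2 w^2, for which the ARE reads 2 a p + q - (b^2/r - d^2/gamma^2) p^2 = 0. It is model-based (it uses a, b and d directly), so it illustrates only the iteration structure and not the data-driven least-squares step described in the abstract. The function name, parameter names and numerical values are all illustrative assumptions.

```python
import math


def hinf_policy_iteration(a, b, d, q, r, gamma, p0=0.0, tol=1e-12, max_iter=100):
    """Scalar policy iteration for 2*a*p + q - (b**2/r - d**2/gamma**2)*p**2 = 0.

    Assumed (hypothetical) setup: dx/dt = a*x + b*u + d*w, with the control
    u = -k*x and the disturbance w = l*x updated simultaneously each step.
    """
    p = p0
    for _ in range(max_iter):
        k = b * p / r             # control gain from the current value p
        l = d * p / gamma ** 2    # worst-case disturbance gain from p
        a_cl = a - b * k + d * l  # closed-loop dynamics under both policies
        if a_cl >= 0.0:
            raise ValueError("closed loop is not stable; choose another p0")
        # Policy evaluation: scalar Lyapunov equation
        #   2*a_cl*p_next + q + r*k**2 - gamma**2*l**2 = 0
        p_next = -(q + r * k ** 2 - gamma ** 2 * l ** 2) / (2.0 * a_cl)
        if abs(p_next - p) < tol:
            return p_next
        p = p_next
    return p


def are_residual(p, a, b, d, q, r, gamma):
    """Left-hand side of the scalar ARE evaluated at p."""
    return 2.0 * a * p + q - (b ** 2 / r - d ** 2 / gamma ** 2) * p ** 2


if __name__ == "__main__":
    params = dict(a=-1.0, b=1.0, d=1.0, q=1.0, r=1.0, gamma=2.0)
    p_star = hinf_policy_iteration(**params)
    print("p =", p_star, "residual =", are_residual(p_star, **params))
```

Because the open-loop system in this example is stable (a < 0), starting from p0 = 0 is admissible. Each step then coincides with a Newton update on the scalar ARE, which is why only a few iterations are needed.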


Parabolic PDE systems; H∞ control; model-free; policy iteration; algebraic Riccati equation





Copyright information

© Springer International Publishing Switzerland 2015

Open Access. This chapter is licensed under the terms of the Creative Commons Attribution-NonCommercial 2.5 International License, which permits any noncommercial use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

Authors and Affiliations

  1. The State Key Laboratory of Management and Control for Complex Systems, Institute of Automation, Chinese Academy of Sciences, Beijing, China
  2. School of Automation and Electrical Engineering, University of Science and Technology Beijing, Beijing, China
