Model-Free Energy Optimization for Energy Internet

  • Qiuye Sun
Part of the Renewable Energy Sources & Energy Storage book series (RESES)


As the penetration of distributed energy grows, energy networks are becoming larger, and optimizing them suffers from heavy computation and slow convergence. To plan and use the various energy resources rationally and to improve the reliability and economy of the overall system, system stability, economic operation, and power flow calculation must be considered together. Reinforcement learning, a branch of machine learning, can learn a control policy from interaction without an explicit system model and can respond quickly once trained, which makes it a candidate for optimal control of such systems. For the energy management model of a regional Energy Internet, this chapter studies how to cast energy management as a Q-learning model and uses the Q-learning algorithm to verify the validity of the model. For the optimal scheduling of large-scale systems, the chapter extends the preceding optimal power flow model of the Energy Internet to an optimal operation structure composed of multiple We-Energies. It then applies a distributed reinforcement learning algorithm with average-consensus information search, so that the communicating We-Energies cooperate to optimize scheduling across the large-scale Energy Internet.
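To make the idea of casting energy management as a Q-learning model concrete, the following is a minimal sketch of tabular Q-learning on a toy dispatch problem. It is not the chapter's model: the state (a discrete demand level), the action (a discrete generator output level), the reward function `toy_reward`, and all parameter values are assumptions chosen only for illustration.

```python
import random


def q_learning_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """One tabular Q-learning step:
    Q(s, a) <- Q(s, a) + alpha * (r + gamma * max_a' Q(s', a') - Q(s, a))."""
    best_next = max(q[next_state])
    q[state][action] += alpha * (reward + gamma * best_next - q[state][action])


def toy_reward(demand, output):
    """Hypothetical reward: negative of an imbalance penalty plus a small generation cost."""
    return -(10.0 * abs(demand - output) + 1.0 * output)


def train(n_levels=3, steps=20000, epsilon=0.2, seed=0):
    """Learn a Q-table for the toy problem with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    q = [[0.0] * n_levels for _ in range(n_levels)]
    state = rng.randrange(n_levels)
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.randrange(n_levels)  # explore
        else:
            action = max(range(n_levels), key=lambda a: q[state][a])  # exploit
        reward = toy_reward(state, action)
        # Assumption: demand evolves randomly and independently of the action.
        next_state = rng.randrange(n_levels)
        q_learning_update(q, state, action, reward, next_state)
        state = next_state
    return q


q_table = train()
# Greedy policy: for each demand level, pick the output with the highest Q-value.
policy = [max(range(3), key=lambda a: q_table[s][a]) for s in range(3)]
```

In this toy setting the learned greedy policy matches output to demand, because any imbalance is penalised far more heavily than generation. The learner never uses the demand model directly; it only observes sampled transitions and rewards, which is the sense in which the approach is model-free.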



Copyright information

© Springer Nature Singapore Pte Ltd. 2019

Authors and Affiliations

  1. College of Information Science and Engineering, Northeastern University, Shenyang, China
