Energy Internet and We-Energy, pp. 299–325

# Model-Free Energy Optimization for Energy Internet

## Abstract

With the high penetration of distributed energy, energy networks are growing in scale, and their optimization suffers from complex computation and slow convergence. To plan and use the various energy resources rationally, and to improve the reliability and economy of the overall system, system stability, economic operation, and power flow calculation must be considered together. Reinforcement learning, a branch of machine learning, is adaptive and fast, and can be used to achieve optimal control of the system. For the energy management model of a regional energy internet, this chapter studies how to transform the energy management problem into a Q-learning model and uses the Q-learning algorithm to verify the validity of the model. For the optimal scheduling of large-scale systems, the chapter then extends the optimal power flow model of the energy internet to an optimal operation structure composed of multiple We-Energies. It uses a distributed reinforcement learning algorithm that relies on average-consensus information search, so that the cooperating and communicating We-Energies jointly optimize the scheduling of the large-scale energy internet.
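To make the idea of casting energy management as a Q-learning problem concrete, here is a minimal tabular Q-learning sketch in Python. It is not the chapter's model: the three storage levels, the two actions, the reward values, and every name (`step`, `q_update`, `train`, `greedy_policy`) are hypothetical choices made purely for illustration. Only the update rule itself is the standard Q-learning rule.

```python
import random

N_LEVELS = 3      # hypothetical storage levels: 0, 1, or 2 units
ACTIONS = (0, 1)  # 0 = buy one unit from the grid, 1 = discharge one unit

def step(level, action):
    """Toy environment (an assumption): one unit of demand per time step."""
    if action == 0:
        return level, -1.0     # pay the grid price, storage unchanged
    if level > 0:
        return level - 1, 0.0  # serve demand from storage at no cost
    return level, -3.0         # storage empty: demand unserved, penalty

def q_update(q, s, a, r, s_next, alpha, gamma):
    """Standard tabular Q-learning update for one observed transition."""
    target = r + gamma * max(q[s_next])
    q[s][a] += alpha * (target - q[s][a])
    return q[s][a]

def train(episodes=2000, steps=10, alpha=0.5, gamma=0.5, epsilon=0.3, seed=0):
    """Epsilon-greedy Q-learning; each episode starts at a random level."""
    rng = random.Random(seed)
    q = [[0.0 for _ in ACTIONS] for _ in range(N_LEVELS)]
    for _ in range(episodes):
        s = rng.randrange(N_LEVELS)
        for _ in range(steps):
            if rng.random() < epsilon:
                a = rng.choice(ACTIONS)                   # explore
            else:
                a = max(ACTIONS, key=lambda x: q[s][x])   # exploit
            s_next, r = step(s, a)
            q_update(q, s, a, r, s_next, alpha, gamma)
            s = s_next
    return q

def greedy_policy(q):
    """Best action per storage level according to the learned table."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(len(q))]
```

Calling `greedy_policy(train())` returns one action per storage level. Under these toy rewards the learned policy should buy from the grid when storage is empty and discharge otherwise. The learner never sees the rules inside `step`, only the transitions and rewards it observes, which is what "model-free" refers to.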
