Abstract
This paper proposes an optimal operation management methodology for the energy internet (EI) based on multi-agent reinforcement learning (MARL). It studies an integrated approach that simultaneously minimizes the total operating cost of the electrical, natural gas, and district heating networks. A novel multi-agent Q(\( \lambda \)) learning method is presented to form a coordinated optimal management strategy for an energy internet with multiple We-Energies (WEs), and an equal interval sampling method is proposed to find the optimal discrete action sets so as to enhance the performance of the control areas. Furthermore, a global Q operator is designed to produce a global Q function that considers the local reward from each agent, with the agents optimizing simultaneously. The proposed method is verified by case studies on a modified energy network. Compared with the centralized approach, the test results show that the proposed method provides a fast solution to the optimal operation management problem and can be applied, with sufficient accuracy, to an energy internet with multiple We-Energies.
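The abstract names two building blocks, equal interval sampling of a discrete action set and Q(\( \lambda \)) learning, without giving their formulas. The following is a minimal single-agent sketch of the generic textbook versions of both, not the paper's algorithm: it assumes a tabular Q function, uses Watkins-style Q(\( \lambda \)) with accumulating eligibility traces, and does not attempt to reproduce the global Q operator. The function names, the state and action indexing, and the values of `alpha`, `gamma`, and `lam` are all hypothetical choices made for illustration.

```python
import numpy as np


def equal_interval_actions(low, high, n):
    """Return n equally spaced action values covering [low, high].

    Assumption: the continuous control range is sampled at equal
    intervals to obtain the discrete action set.
    """
    return np.linspace(low, high, n)


def q_lambda_step(Q, E, s, a, r, s_next, a_next,
                  alpha=0.1, gamma=0.9, lam=0.8):
    """Apply one Watkins-style Q(lambda) update in place.

    Q and E are (n_states, n_actions) arrays holding the value table
    and the eligibility traces. Returns the temporal-difference error.
    """
    best_next = Q[s_next].max()
    # Decide whether the next action is greedy before Q is modified.
    next_is_greedy = Q[s_next, a_next] == best_next
    delta = r + gamma * best_next - Q[s, a]
    E[s, a] += 1.0            # accumulating trace for the visited pair
    Q += alpha * delta * E    # credit all recently visited pairs
    if next_is_greedy:
        E *= gamma * lam      # decay traces along the greedy path
    else:
        E.fill(0.0)           # cut traces after an exploratory action
    return delta


# Hypothetical usage: 2 states, 5 actions sampled from [0, 1].
actions = equal_interval_actions(0.0, 1.0, 5)
Q = np.zeros((2, len(actions)))
E = np.zeros_like(Q)
delta = q_lambda_step(Q, E, s=0, a=2, r=1.0, s_next=1, a_next=0)
```

In a multi-agent setting each agent would presumably hold its own `Q` and `E`; how those local tables are combined into a global Q function is specific to the paper and is not sketched here.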
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Yang, L., Sun, Q., Han, Y. (2017). Multi-Agent Q(\( \lambda \)) Learning for Optimal Operation Management of Energy Internet. In: Liu, D., Xie, S., Li, Y., Zhao, D., El-Alfy, ES. (eds) Neural Information Processing. ICONIP 2017. Lecture Notes in Computer Science, vol 10639. Springer, Cham. https://doi.org/10.1007/978-3-319-70136-3_32
DOI: https://doi.org/10.1007/978-3-319-70136-3_32
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-70135-6
Online ISBN: 978-3-319-70136-3
eBook Packages: Computer Science, Computer Science (R0)