Multi-Agent Q($ \lambda $) Learning for Optimal Operation Management of Energy Internet

Yang, Lingxiao; Sun, Qiuye; Han, Yue

doi:10.1007/978-3-319-70136-3_32

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 10639)

Included in the following conference series:

International Conference on Neural Information Processing

Abstract

This paper proposes an optimal operation management methodology for the energy internet (EI) based on multi-agent reinforcement learning (MARL). It studies an integrated approach that minimizes the total operating cost of the electrical, natural gas, and district heating networks simultaneously. A novel multi-agent Q($ \lambda $) learning method is presented to form a coordinated optimal management strategy for an energy internet with multiple We-Energies (WEs), and an equal-interval sampling method is proposed to find the optimal discrete action sets so as to enhance the performance of the control areas. Furthermore, a global Q operator is designed to produce a global Q function that considers the local reward of each agent, with all agents optimizing simultaneously. The proposed method is verified by case studies on a modified energy network. Compared with the centralized approach, the test results show that the proposed method provides a fast solution for optimal operation management and can be applied to an energy internet with multiple We-Energies with sufficient accuracy.
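
The three ingredients named in the abstract can be illustrated with a minimal Python sketch. It is not the chapter's implementation, since the abstract gives no equations. The sketch assumes a tabular Watkins-style Q($ \lambda $) update with accumulating eligibility traces, equally spaced action set-points, and a global Q formed by summing the agents' local Q values over a shared state and action space. The summation is only one plausible reading of the "global Q operator". All names (`equal_interval_actions`, `QLambdaAgent`, `global_q`) and the default parameters are hypothetical, and cost is encoded as a negative reward.

```python
import numpy as np


def equal_interval_actions(low, high, n):
    """Sample n equally spaced action set-points in [low, high], endpoints included."""
    return np.linspace(low, high, n)


class QLambdaAgent:
    """Tabular Watkins-style Q(lambda) learner (an assumed variant, not the paper's)."""

    def __init__(self, n_states, n_actions, alpha=0.1, gamma=0.9, lam=0.8):
        self.q = np.zeros((n_states, n_actions))  # local Q table
        self.e = np.zeros_like(self.q)            # eligibility traces
        self.alpha, self.gamma, self.lam = alpha, gamma, lam

    def update(self, s, a, reward, s_next, a_next):
        """Apply one Q(lambda) step for the transition (s, a, reward, s_next)."""
        best_next = self.q[s_next].max()
        # Decide before updating whether the action taken next is greedy.
        next_is_greedy = self.q[s_next, a_next] == best_next
        # Temporal-difference error toward the greedy target.
        delta = reward + self.gamma * best_next - self.q[s, a]
        self.e[s, a] += 1.0
        # Every visited pair is corrected in proportion to its trace.
        self.q += self.alpha * delta * self.e
        if next_is_greedy:
            self.e *= self.gamma * self.lam  # decay traces
        else:
            self.e[:] = 0.0                  # exploratory action: cut traces


def global_q(agents, s):
    """Hypothetical global Q for state s: the sum of the agents' local Q rows."""
    return sum(agent.q[s] for agent in agents)


if __name__ == "__main__":
    actions = equal_interval_actions(0.0, 1.0, 5)
    agents = [QLambdaAgent(n_states=2, n_actions=len(actions)) for _ in range(2)]
    for agent in agents:
        agent.update(s=0, a=1, reward=-2.0, s_next=1, a_next=0)  # reward = -cost
    joint_action = int(np.argmax(global_q(agents, 0)))
    print(actions[joint_action])
```

In this sketch each agent learns from its own local reward, and a coordinator picks the action that maximizes the summed Q values. How the chapter actually couples the agents, and how it chooses the sampling interval, would have to be taken from the full text.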

Author information

Authors and Affiliations

Northeastern University, Shenyang, 110819, Liaoning, China
Lingxiao Yang & Qiuye Sun
State Grid Liaoning Electric Power Research Institute, Shenyang, 110819, Liaoning, China
Yue Han

Corresponding author

Correspondence to Lingxiao Yang.

Editor information

Editors and Affiliations

Guangdong University of Technology, Guangzhou, China
Derong Liu
Guangdong University of Technology, Guangzhou, China
Shengli Xie
South China University of Technology, Guangzhou, China
Yuanqing Li
Institute of Automation, Chinese Academy of Sciences, Beijing, China
Dongbin Zhao
King Fahd University of Petroleum and Minerals, Dhahran, Saudi Arabia
El-Sayed M. El-Alfy

About this paper

Cite this paper

Yang, L., Sun, Q., Han, Y. (2017). Multi-Agent Q($ \lambda $) Learning for Optimal Operation Management of Energy Internet. In: Liu, D., Xie, S., Li, Y., Zhao, D., El-Alfy, ES. (eds) Neural Information Processing. ICONIP 2017. Lecture Notes in Computer Science, vol 10639. Springer, Cham. https://doi.org/10.1007/978-3-319-70136-3_32

DOI: https://doi.org/10.1007/978-3-319-70136-3_32
Published: 26 October 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-70135-6
Online ISBN: 978-3-319-70136-3
eBook Packages: Computer Science, Computer Science (R0)
