Multi-Agent Q(\( \lambda \)) Learning for Optimal Operation Management of Energy Internet

  • Conference paper
Neural Information Processing (ICONIP 2017)

Abstract

This paper proposes an optimal operation management methodology for the energy internet (EI) based on multi-agent reinforcement learning (MARL). An integrated approach is studied that minimizes the total operating cost of the electrical, natural gas, and district heating networks simultaneously. A novel multi-agent Q(\( \lambda \)) learning method is presented to form a coordinated optimal management strategy for an energy internet with multiple We-Energies (WEs), and an equal interval sampling method is proposed to find the optimal discrete action sets so as to enhance the performance of the control areas. Furthermore, a global Q operator is designed to produce a global Q function that considers the local reward from each agent, all of which optimize simultaneously. The proposed method is verified by case studies on a modified energy network. Compared with the centralized approach, the test results show that the proposed method provides a fast solution for optimal operation management and can be applied, with sufficient accuracy, to an energy internet with multiple We-Energies.
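The abstract names three ingredients: equal interval sampling of the action range, Q(\( \lambda \)) learning per agent, and a global Q operator that combines the local results. The following is a minimal illustrative sketch of how such pieces could fit together, not the authors' implementation. It assumes tabular agents sharing one state and action index, a Watkins-style Q(\( \lambda \)) update with accumulating traces, and a global Q formed as a plain sum of local Q values in the style of Q-decomposition. All names (`equal_interval_actions`, `QLambdaAgent`, `global_q`) and all parameter values are hypothetical.

```python
import numpy as np


def equal_interval_actions(low, high, n):
    """Discretize a continuous action range into n evenly spaced actions."""
    return np.linspace(low, high, n)


class QLambdaAgent:
    """Tabular Watkins-style Q(lambda) learner (illustrative assumption)."""

    def __init__(self, n_states, n_actions, alpha=0.1, gamma=0.9, lam=0.8):
        self.q = np.zeros((n_states, n_actions))  # local Q table
        self.e = np.zeros_like(self.q)            # eligibility traces
        self.alpha, self.gamma, self.lam = alpha, gamma, lam

    def update(self, s, a, r, s_next, next_greedy=True):
        # TD error measured against the greedy value of the next state.
        delta = r + self.gamma * self.q[s_next].max() - self.q[s, a]
        self.e[s, a] += 1.0                        # accumulating trace
        self.q += self.alpha * delta * self.e      # credit all traced pairs
        if next_greedy:
            self.e *= self.gamma * self.lam        # decay traces
        else:
            self.e[:] = 0.0                        # cut traces on exploration


def global_q(agents, s):
    """Assumed global Q operator: sum of the agents' local Q rows at state s."""
    return np.sum([agent.q[s] for agent in agents], axis=0)


# Usage: two agents, each learning from its own local reward.
actions = equal_interval_actions(0.0, 1.0, 5)
agents = [QLambdaAgent(n_states=2, n_actions=len(actions), alpha=0.5)
          for _ in range(2)]
for agent, local_reward in zip(agents, (1.0, 2.0)):
    agent.update(s=0, a=3, r=local_reward, s_next=1)
best = int(np.argmax(global_q(agents, 0)))
print("jointly preferred action:", actions[best])
```

Summing the local tables keeps each agent's learning independent while still letting a coordinator pick one action from the combined values. The paper's actual global Q operator may weight or combine the local rewards differently.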

Author information

Corresponding author

Correspondence to Lingxiao Yang.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Yang, L., Sun, Q., Han, Y. (2017). Multi-Agent Q(\( \lambda \)) Learning for Optimal Operation Management of Energy Internet. In: Liu, D., Xie, S., Li, Y., Zhao, D., El-Alfy, ES. (eds) Neural Information Processing. ICONIP 2017. Lecture Notes in Computer Science, vol 10639. Springer, Cham. https://doi.org/10.1007/978-3-319-70136-3_32

  • DOI: https://doi.org/10.1007/978-3-319-70136-3_32

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-70135-6

  • Online ISBN: 978-3-319-70136-3

  • eBook Packages: Computer Science, Computer Science (R0)
