Hierarchical multiagent reinforcement learning schemes for air traffic management

Spatharis, Christos; Bastas, Alevizos; Kravaris, Theocharis; Blekas, Konstantinos; Vouros, George A.; Cordero, Jose Manuel

doi:10.1007/s00521-021-05748-7

S.I. : Information, Intelligence, Systems and Applications
Published: 10 February 2021

Volume 35, pages 147–159, (2023)

Neural Computing and Applications

Christos Spatharis¹,
Alevizos Bastas²,
Theocharis Kravaris²,
Konstantinos Blekas¹,
George A. Vouros² (ORCID: orcid.org/0000-0001-5451-622X) &
Jose Manuel Cordero³

1058 Accesses
12 Citations

Abstract

In this work, we investigate the use of hierarchical multiagent reinforcement learning methods for computing policies that resolve congestion problems in the air traffic management domain. To address cases where the demand for airspace use exceeds capacity, we consider agents representing flights, which need to decide on ground delays at the pre-tactical stage of operations so that they can execute their trajectories while adhering to airspace capacity constraints. Hierarchical reinforcement learning handles highly complex real-world problems by partitioning the task into hierarchies of states and/or actions. This provides an efficient way of exploring the state–action space and constructing an advantageous decision-making mechanism. We first establish a general framework of hierarchical multiagent reinforcement learning, and then formulate four alternative schemes of abstraction on states, actions, or both. To quantitatively assess the quality of the solutions produced by the proposed approaches and to show the potential of hierarchical methods in resolving the demand–capacity balance problem, we provide experimental results on real-world evaluation cases, where we measure the average delay per flight and the number of flights with delays.
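
To make the demand–capacity balance problem and the two reported metrics concrete, the following is a minimal sketch and not the authors' implementation. It assumes a toy setting in which each flight crosses named sectors at integer counting periods, a ground delay shifts all of a flight's crossings by a whole number of periods, and each sector has a fixed capacity per period. All names (`trajectories`, `capacity`, `hotspots`, `delay_metrics`) and the data are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Tuple

# Hypothetical toy model: a trajectory is a list of (sector, period) crossings
# as they would occur with zero ground delay.
Trajectory = List[Tuple[str, int]]


def demand(trajectories: Dict[str, Trajectory],
           delays: Dict[str, int]) -> Counter:
    """Count flights per (sector, period) after applying ground delays."""
    counts: Counter = Counter()
    for flight, crossings in trajectories.items():
        shift = delays.get(flight, 0)  # ground delay, in counting periods
        for sector, period in crossings:
            counts[(sector, period + shift)] += 1
    return counts


def hotspots(trajectories: Dict[str, Trajectory],
             delays: Dict[str, int],
             capacity: Dict[str, int]) -> Dict[Tuple[str, int], int]:
    """Return the (sector, period) pairs whose demand exceeds capacity."""
    return {key: n for key, n in demand(trajectories, delays).items()
            if n > capacity[key[0]]}


def delay_metrics(trajectories: Dict[str, Trajectory],
                  delays: Dict[str, int]) -> Tuple[float, int]:
    """Average delay per flight and number of flights with a nonzero delay."""
    total = sum(delays.get(f, 0) for f in trajectories)
    delayed = sum(1 for f in trajectories if delays.get(f, 0) > 0)
    return total / len(trajectories), delayed


# Hypothetical data: three flights, two sectors, capacity of two per period.
trajectories = {
    "F1": [("A", 0), ("B", 1)],
    "F2": [("A", 0)],
    "F3": [("A", 0), ("B", 1)],
}
capacity = {"A": 2, "B": 2}

print(hotspots(trajectories, {}, capacity))         # sector A is overloaded
print(hotspots(trajectories, {"F2": 1}, capacity))  # delaying F2 resolves it
print(delay_metrics(trajectories, {"F2": 1}))
```

In this toy case, delaying a single flight by one period removes the only hotspot. A learned policy would have to find such delay assignments jointly for all flights while keeping both metrics low.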

Notes

http://www.eurocontrol.int/articles/air-traffic-flow-and-capacity-management.
The mapping of joint states to abstract joint states is straightforward using \(\phi_L\), given that each joint state is a concatenation of local state parameters.
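
The second note can be illustrated with a short sketch. It assumes, purely for illustration, that a local state is a tuple of integer parameters and that the local abstraction coarsens each parameter into buckets. The bucket size and the function `abstract_joint_state` are hypothetical, and the actual form of \(\phi_L\) is defined in the full article, not here.

```python
from typing import Tuple

LocalState = Tuple[int, ...]
JointState = Tuple[LocalState, ...]

BUCKET = 5  # hypothetical coarsening factor, not taken from the article


def phi_L(local_state: LocalState) -> LocalState:
    """Hypothetical local abstraction: bucket each local state parameter."""
    return tuple(x // BUCKET for x in local_state)


def abstract_joint_state(joint_state: JointState) -> JointState:
    """Apply phi_L to every local component of the concatenated joint state."""
    return tuple(phi_L(s) for s in joint_state)


# Three agents, each with two hypothetical integer state parameters.
joint = ((3, 12), (7, 0), (14, 9))
abstract = abstract_joint_state(joint)
print(abstract)  # ((0, 2), (1, 0), (2, 1))
```

Because the joint state is just the concatenation of local states, the abstract joint state is obtained component by component, with no additional joint-level mapping.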

Acknowledgements

This work has been partially supported by the National Matching Funds 2017-2018 of the Greek Government, and more specifically by the General Secretariat for Research and Technology (GSRT), in relation to the DART (www.dart-research.eu) project. The major part of this work was completed during the DART project, in which the authors participated as members of the University of Piraeus Research Center group. We would also like to acknowledge the contribution of CRIDA in providing the data during the DART project.

Author information

Authors and Affiliations

Department of Computer Science and Engineering, University of Ioannina, Ioannina, Greece
Christos Spatharis & Konstantinos Blekas
Department of Digital Systems, University of Piraeus, Piraeus, Greece
Alevizos Bastas, Theocharis Kravaris & George A. Vouros
CRIDA, Madrid, Spain
Jose Manuel Cordero

Corresponding author

Correspondence to George A. Vouros.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Spatharis, C., Bastas, A., Kravaris, T. et al. Hierarchical multiagent reinforcement learning schemes for air traffic management. Neural Comput & Applic 35, 147–159 (2023). https://doi.org/10.1007/s00521-021-05748-7

Received: 30 January 2020
Accepted: 16 January 2021
Published: 10 February 2021
Issue Date: January 2023
DOI: https://doi.org/10.1007/s00521-021-05748-7
