Agent Based Simulation of Network Routing: Reinforcement Learning Comparison

  • Krešimir Čunko
  • Marin Vuković
  • Dragan Jevtić (email author)
Conference paper
Part of the Smart Innovation, Systems and Technologies book series (SIST, volume 96)


The paper considers and compares two methods for self-adaptive routing in communication networks based on immobile agents. Two reinforcement learning algorithms, Q-learning and SARSA, were employed in a simulated environment, and the results were gathered and compared. Since the task of routing is to find the optimal path between source and destination for every piece of service information, the critical issue for routing in communication networks is quality of service, which requires coordination and support from many distributed devices. These devices change their properties over time; they can appear and they can fail. The task of the agents is therefore to learn to predict new situations while operating continuously, i.e. to self-adapt their function. Our experiments show that the SARSA agent outperforms the Q-learning agent in information routing, but in some situations both agents fail. The circumstances in the agent environment under which the agents do not prevail were detected and described.
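The core difference between the two algorithms compared in the paper lies in their update rules: Q-learning bootstraps off-policy from the greedy action in the next state, while SARSA bootstraps on-policy from the action actually taken. A minimal tabular sketch, not the authors' implementation (the state/action encoding and learning parameters are assumptions):

```python
def q_learning_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.9):
    """Off-policy update: bootstrap from the best action in the next state."""
    Q[s][a] += alpha * (r + gamma * max(Q[s_next]) - Q[s][a])

def sarsa_update(Q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.9):
    """On-policy update: bootstrap from the action actually taken next."""
    Q[s][a] += alpha * (r + gamma * Q[s_next][a_next] - Q[s][a])

# Toy illustration: two states, two actions per state.
Q = {0: [0.0, 1.0], 1: [2.0, 0.5]}
q_learning_update(Q, s=0, a=0, r=1.0, s_next=1)
print(Q[0][0])  # 0.1 * (1.0 + 0.9 * 2.0 - 0.0) = 0.28
```

Because SARSA's target follows the exploratory policy itself, it tends to learn more conservative routes under exploration, which is consistent with the behavioural differences the paper reports between the two agents.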


Keywords: Reinforcement learning · Q-learning · SARSA · Agent · Routing



Copyright information

© Springer International Publishing AG, part of Springer Nature 2019

Authors and Affiliations

  • Krešimir Čunko: Erste Group Card Processor, Zagreb, Croatia
  • Marin Vuković: Faculty of Electrical Engineering and Computing, University of Zagreb, Zagreb, Croatia
  • Dragan Jevtić (email author): Faculty of Electrical Engineering and Computing, University of Zagreb, Zagreb, Croatia
