Visualizing Deep Q-Learning to Understanding Behavior of Swarm Robotic System

  • Xiaotong Nie
  • Motoaki Hiraga
  • Kazuhiro Ohkura
Conference paper
Part of the Proceedings in Adaptation, Learning and Optimization book series (PALO, volume 12)


Swarm robotic systems (SRS) are multi-robot systems that consist of many homogeneous autonomous robots and are inspired by social insects. In our previous study, we succeeded in developing end-to-end control policies for SRS using the Deep Q-Network (DQN) algorithm. However, since a DQN is essentially a black box, it is difficult to understand what was learned during the learning process. Therefore, in this paper, we propose a novel method for visualizing the decision-making process in the DQN by combining a Deconvolutional Network (Deconvnet) with Gradient-weighted Class Activation Mapping (Grad-CAM). We then show what is preserved as deep features and which parts of the input image the network attends to when deciding on an action. The proposed method is demonstrated through computer simulations of a round-trip task, in which the swarm robots must visit two different locations alternately as many times as possible. The simulation results indicate that the proposed method visualizes the policies learned by the DQN.
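The Grad-CAM step mentioned above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a toy Q-network whose head is a single linear layer over the flattened feature maps of the last convolutional layer, so the gradient of a Q-value with respect to those feature maps can be written down analytically. The function name, argument names, and array shapes are all hypothetical.

```python
import numpy as np

def grad_cam_linear_head(feature_maps, fc_weights, fc_bias, action=None):
    """Grad-CAM heatmap for a toy Q-network with a linear head (an assumption).

    feature_maps: (K, H, W) activations of the last convolutional layer.
    fc_weights:   (num_actions, K*H*W) weights mapping flat features to Q-values.
    fc_bias:      (num_actions,) biases.
    """
    k, h, w = feature_maps.shape
    q_values = fc_weights @ feature_maps.reshape(-1) + fc_bias
    if action is None:
        action = int(np.argmax(q_values))  # explain the greedy action
    # For a linear head, dQ_a/dA is exactly the weight row reshaped to (K, H, W).
    grads = fc_weights[action].reshape(k, h, w)
    # Channel importance: global-average-pool the gradients over space.
    alphas = grads.mean(axis=(1, 2))
    # Weighted sum of the feature maps followed by ReLU.
    cam = np.maximum(np.tensordot(alphas, feature_maps, axes=1), 0.0)
    # Scale to [0, 1] for display, unless the map is all zeros.
    if cam.max() > 0:
        cam = cam / cam.max()
    return action, q_values, cam

# Hypothetical example: 4 feature maps of size 5x5 and 3 discrete actions.
rng = np.random.default_rng(0)
feature_maps = np.maximum(rng.normal(size=(4, 5, 5)), 0.0)  # ReLU-like activations
fc_weights = rng.normal(size=(3, 4 * 5 * 5))
fc_bias = np.zeros(3)
action, q_values, cam = grad_cam_linear_head(feature_maps, fc_weights, fc_bias)
```

The resulting `cam` has the spatial size of the feature maps; in practice it would be upsampled to the input image size and overlaid on the robot's camera image. For a deeper head, the analytic gradient would be replaced by automatic differentiation.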


Keywords: Deep Q-Network, Deconvnet, Grad-CAM, Swarm Robotic System



Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  1. Graduate School of Engineering, Hiroshima University, Hiroshima, Japan
