A Hybrid Reinforcement Learning and Cellular Automata Model for Crowd Simulation on the GPU

  • Sergio Ruiz
  • Benjamín Hernández
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 979)


We present a GPU-based hybrid model for crowd simulations. The model uses reinforcement learning to guide groups of pedestrians towards a goal while adapting to environmental dynamics, and a cellular automaton to describe individual pedestrians’ interactions. In contrast to traditional multi-agent reinforcement learning methods, our model encodes the learned navigation policy into a navigation map, which is used by the cellular automaton’s update rule to calculate the next simulation step. As a result, reinforcement learning is independent of the number of agents, allowing the simulation of large crowds. Implementation of this model on the GPU allows interactive simulations of several hundred pedestrians.
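
The split between a shared navigation map and a per-pedestrian cellular automaton rule can be illustrated with a small CPU sketch. It is not the authors' implementation: value iteration on a grid stands in for the reinforcement learning step, the four-cell neighbourhood, the reward of 1 at the goal, the discount factor, and the sequential conflict handling are all assumptions, and the names build_navigation_map and ca_step are hypothetical.

```python
# Four-connected moves: up, down, left, right (an assumed neighbourhood).
MOVES = [(-1, 0), (1, 0), (0, -1), (0, 1)]


def build_navigation_map(obstacles, goal, gamma=0.9, sweeps=100):
    """Return, per cell, the index into MOVES of the best move (-1 for goal/obstacle).

    Value iteration is used here as a stand-in for the learning step.
    The cost depends on the grid size only, not on the number of pedestrians.
    """
    rows, cols = len(obstacles), len(obstacles[0])
    value = [[0.0] * cols for _ in range(rows)]
    nav = [[-1] * cols for _ in range(rows)]
    for _ in range(sweeps):
        for r in range(rows):
            for c in range(cols):
                if obstacles[r][c] or (r, c) == goal:
                    continue
                best_q, best_a = float("-inf"), -1
                for a, (dr, dc) in enumerate(MOVES):
                    nr, nc = r + dr, c + dc
                    if not (0 <= nr < rows and 0 <= nc < cols) or obstacles[nr][nc]:
                        continue  # moves into walls or off the grid are not allowed
                    reward = 1.0 if (nr, nc) == goal else 0.0
                    q = reward + gamma * value[nr][nc]
                    if q > best_q:
                        best_q, best_a = q, a
                if best_a >= 0:
                    value[r][c] = best_q
                    nav[r][c] = best_a
    return nav


def ca_step(positions, nav):
    """One cellular automaton step: each pedestrian follows the navigation map
    into its target cell if that cell is free, otherwise it waits.

    Pedestrians are updated one after another, a simplification; a parallel
    version would need an explicit rule for two agents claiming the same cell.
    """
    occupied = set(positions)
    new_positions = []
    for (r, c) in positions:
        a = nav[r][c]
        if a < 0:  # at the goal (or on an invalid cell): stay put
            new_positions.append((r, c))
            continue
        dr, dc = MOVES[a]
        target = (r + dr, c + dc)
        if target in occupied:  # blocked by another pedestrian: wait
            new_positions.append((r, c))
        else:
            occupied.discard((r, c))
            occupied.add(target)
            new_positions.append(target)
    return new_positions


# Hypothetical 5x5 scene: a wall in column 2 with a gap in the bottom row.
obstacles = [[c == 2 and r < 4 for c in range(5)] for r in range(5)]
goal = (0, 4)
nav = build_navigation_map(obstacles, goal)

crowd = [(0, 0), (1, 0), (2, 1)]
for _ in range(30):
    crowd = ca_step(crowd, nav)
print(crowd)
```

In this sketch the map is computed once for the whole scene and every pedestrian only reads it, so adding pedestrians increases the work of ca_step but not of build_navigation_map. That is the property the abstract describes as learning being independent of the number of agents.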


Keywords: Reinforcement learning, Crowd simulation, Cellular automata, GPU



This research used resources of the Oak Ridge Leadership Computing Facility, which is a DOE Office of Science User Facility supported under Contract DE-AC05-00OR22725. We thank NVIDIA for the donation of the Titan X GPU used in this research. Sergio Ruiz would like to thank the Tecnológico de Monterrey Computer Department for its support.



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. Tecnológico de Monterrey, Mexico City, Mexico
  2. Oak Ridge National Laboratory, Oak Ridge, USA
