Solving Safety Problems with Ensemble Reinforcement Learning

  • Leonardo A. Ferreira
  • Thiago F. dos Santos
  • Reinaldo A. C. Bianchi
  • Paulo E. Santos
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11919)


An agent that learns by interacting with an environment may find unexpected solutions to decision-making problems. Such a solution can be an improvement over well-known ones, such as a new strategy for a game, but in some cases the unexpected solution is unwanted and should be avoided for reasons such as safety. This paper proposes a Reinforcement Learning Ensemble Framework called ReLeEF. The framework combines decision-making methods to provide finer-grained control of the agent’s behaviour while still letting it learn by interacting with the environment. It was tested in the safety gridworlds, and the results show that it can find optimal solutions while satisfying the safety concerns described for each domain, something that state-of-the-art Deep Reinforcement Learning methods were unable to do.


Keywords: Reinforcement Learning · Ontology · Safety



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Leonardo A. Ferreira (1)
  • Thiago F. dos Santos (1)
  • Reinaldo A. C. Bianchi (1)
  • Paulo E. Santos (1, 2)
  1. Centro Universitário FEI, São Bernardo do Campo, Brazil
  2. College of Science and Engineering, Flinders University, Adelaide, Australia
