Learning Abstract Planning Domains and Mappings to Real World Perceptions

  • Luciano Serafini
  • Paolo Traverso
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11946)


Most work on planning and learning, e.g., planning by (model-based) reinforcement learning, rests on two main assumptions: (i) the set of states of the planning domain is fixed; (ii) the mapping between observations from the real world and states is implicitly assumed and is not part of the planning domain. Consequently, the focus is on learning the transitions between states. Current approaches address neither the problem of learning new states of the planning domain, nor the problem of representing and updating the mapping between real-world perceptions and states. In this paper, we drop these assumptions. We provide a formal framework in which (i) the agent can dynamically learn new states of the planning domain; (ii) the mapping between abstract states and perceptions from the real world, represented by continuous variables, is part of the planning domain; (iii) this mapping is learned and updated throughout the “life” of the agent. We define and develop an algorithm that interleaves planning, acting, and learning. We provide a first experimental evaluation that shows how this novel framework can effectively learn coherent abstract planning models.
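The idea of learning both new abstract states and the perception-to-state mapping online can be illustrated with a minimal sketch. The class below is a hypothetical illustration, not the paper's actual algorithm: it maintains one prototype (centroid) per abstract state over continuous observations, creates a new abstract state when an observation lies farther than a distance threshold from every known prototype, and otherwise updates the nearest prototype as a running mean. The class name, the threshold parameter, and the nearest-centroid scheme are all assumptions made for illustration.

```python
import math


class PerceptionMapping:
    """Sketch: map continuous observations to discrete abstract states.

    A new abstract state is created whenever an observation lies
    farther than `threshold` from every known state prototype;
    otherwise the nearest prototype is updated incrementally
    (running mean), so the mapping evolves with the agent's "life".
    """

    def __init__(self, threshold):
        self.threshold = threshold
        self.prototypes = []  # one centroid per abstract state
        self.counts = []      # observations assigned to each state

    def _distance(self, a, b):
        # Euclidean distance between two observation vectors
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    def observe(self, obs):
        """Return the abstract state id for `obs`, learning as needed."""
        if self.prototypes:
            dists = [self._distance(obs, p) for p in self.prototypes]
            s = min(range(len(dists)), key=dists.__getitem__)
            if dists[s] <= self.threshold:
                # Observation explained by state s: refine its prototype
                self.counts[s] += 1
                n = self.counts[s]
                self.prototypes[s] = [
                    p + (x - p) / n for p, x in zip(self.prototypes[s], obs)
                ]
                return s
        # Observation not explained by any known state: add a new one
        self.prototypes.append(list(obs))
        self.counts.append(1)
        return len(self.prototypes) - 1
```

For example, two nearby observations would be merged into one abstract state, while a distant one would trigger the creation of a second state. A full realization would also have to learn transitions between these abstract states, which is where the interleaving of planning, acting, and learning comes in.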



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. Fondazione Bruno Kessler, Trento, Italy