Anytime Algorithms for Solving Possibilistic MDPs and Hybrid MDPs

  • Kim BautersEmail author
  • Weiru Liu
  • Lluís Godo
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9616)


The ability of an agent to make quick, rational decisions in an uncertain environment is paramount for its applicability in realistic settings. Markov Decision Processes (MDP) provide such a framework, but can only model uncertainty that can be expressed as probabilities. Possibilistic counterparts of MDPs allow to model imprecise beliefs, yet they cannot accurately represent probabilistic sources of uncertainty and they lack the efficient online solvers found in the probabilistic MDP community. In this paper we advance the state of the art in three important ways. Firstly, we propose the first online planner for possibilistic MDP by adapting the Monte-Carlo Tree Search (MCTS) algorithm. A key component is the development of efficient search structures to sample possibility distributions based on the DPY transformation as introduced by Dubois, Prade, and Yager. Secondly, we introduce a hybrid MDP model that allows us to express both possibilistic and probabilistic uncertainty, where the hybrid model is a proper extension of both probabilistic and possibilistic MDPs. Thirdly, we demonstrate that MCTS algorithms can readily be applied to solve such hybrid models.


Markov Decision Process (MDP) Monte Carlo Tree Search (MCTS) MCTS Algorithm Possibility Distribution Possibilistic Counterpart 
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.



This work is partially funded by EPSRC PACES project (Ref: EP/J012149/1). Special thanks to Steven Schockaert who read an early version of the paper and provided invaluable feedback. We also like to thank the reviewers for taking the time to read the paper in detail and provide feedback that helped to further improve the quality of the paper.


  1. 1.
    Auer, P., Cesa-Bianchi, N., Fischer, P.: Finite-time analysis of the multiarmed bandit problem. Mach. Learn. 47(2–3), 235–256 (2002)CrossRefzbMATHGoogle Scholar
  2. 2.
    Bellman, R.: A Markovian decision process. Indiana Univ. Math. J. 6, 679–684 (1957)MathSciNetCrossRefzbMATHGoogle Scholar
  3. 3.
    Drougard, N., Teichteil-Königsbuch, F., Farges, J., Dubois, D.: Qualitative possibilistic mixed-observable MDPs. In: Proceedings of the 29th Conference on Uncertainty in Artificial Intelligence (UAI 2013) (2013)Google Scholar
  4. 4.
    Drougard, N., Teichteil-Königsbuch, F., Farges, J., Dubois, D.: Structured possibilistic planning using decision diagrams. In: Proceedings of the 28th AI Conference on Artificial Intelligence (AAAI 2014), pp. 2257–2263 (2014)Google Scholar
  5. 5.
    Dubois, D., Prade, H.: On several representations of an uncertain body of evidence. In: Gupta, M.M., Sanchez, E. (eds.) Fuzzy Information and Decision Processes, pp. 167–181. North-Holland, Amsterdam (1982)Google Scholar
  6. 6.
    Dubois, D., Prade, H.: Unfair coins and necessity measures: towards a possibilistic interpretation of histograms. Fuzzy Sets Syst. 10(1), 15–20 (1983)MathSciNetCrossRefzbMATHGoogle Scholar
  7. 7.
    Dubois, D., Prade, H.: Possibility theory and its application: where do we stand? Mathware Soft Comput. 18(1), 18–31 (2011)MathSciNetGoogle Scholar
  8. 8.
    Dubois, D., Prade, H., Sandri, S.: On possibility/probability transformation. In: Proceedings of the 4th International Fuzzy Systems Association Congress (IFSA 1991), pp. 50–53 (1991)Google Scholar
  9. 9.
    Dubois, D., Prade, H., Smets, P.: New semantics for quantitative possibility theory. In: Benferhat, S., Besnard, P. (eds.) ECSQARU 2001. LNCS (LNAI), vol. 2143, pp. 410–421. Springer, Heidelberg (2001)CrossRefGoogle Scholar
  10. 10.
    Kaufmann, A.: La simulation des sous-ensembles flous. In: Table Ronde CNRS-Quelques Applications Concrètes Utilisant les Derniers Perfectionnements de la Théorie du Flou (1980)Google Scholar
  11. 11.
    Kearns, M., Mansour, Y., Ng, A.: A sparse sampling algorithm for near-optimal planning in large Markov decision processes. In: Proceedings of the 16th International Joint Conference on Artificial Intelligence (IJCAI 1999), pp. 1324–1231 (1999)Google Scholar
  12. 12.
    Keller, T., Eyerich, P.: PROST: probabilistic planning based on UCT. In: Proceedings of the 22nd International Conference on Automated Planning and Scheduling (ICAPS 2012) (2012)Google Scholar
  13. 13.
    Klir, G.: A principle of uncertainty and information invariance. Int. J. Gen. Syst. 17(2–3), 249–275 (1990)CrossRefzbMATHGoogle Scholar
  14. 14.
    Kocsis, L., Szepesvári, C.: Bandit based monte-carlo planning. In: Fürnkranz, J., Scheffer, T., Spiliopoulou, M. (eds.) ECML 2006. LNCS (LNAI), vol. 4212, pp. 282–293. Springer, Heidelberg (2006)CrossRefGoogle Scholar
  15. 15.
    Kolobov, A., Mausam, Weld, D.: LRTDP versus UCT for online probabilistic planning. In: Proceedings of the 26th AAAI Conference on Artificial Intelligence (AAAI 2012) (2012)Google Scholar
  16. 16.
    Rao, A., Georgeff, M.: Modeling rational agents within a BDI-architecture. In: Proceedings of the 2nd International Conference on Principles of Knowledge Representation and Reasoning (KR 1991), pp. 473–484 (1991)Google Scholar
  17. 17.
    Sabbadin, R.: A possibilistic model for qualitative sequential decision problems under uncertainty in partially observable environments. In: Proceedings of the 15th Conference on Uncertainty in Artificial Intelligence (UAI 1999), pp. 567–574 (1999)Google Scholar
  18. 18.
    Sabbadin, R., Fargier, H., Lang, J.: Towards qualitative approaches to multi-stage decision making. Int. J. Approximate Reasoning 19(3), 441–471 (1998)MathSciNetCrossRefzbMATHGoogle Scholar
  19. 19.
    Shafer, G., et al.: A Mathematical Theory of Evidence. Princeton University Press, Princeton (1976)zbMATHGoogle Scholar
  20. 20.
    Smets, P.: Constructing the pignistic probability function in a context of uncertainty. In: Proceedings of the 5th Annual Conference on Uncertainty in Artificial Intelligence (UAI 1989), pp. 29–40 (1989)Google Scholar
  21. 21.
    Vose, M.: A linear algorithm for generating random numbers with a given distribution. IEEE Trans. Softw. Eng. 17(9), 972–975 (1991)MathSciNetCrossRefGoogle Scholar
  22. 22.
    Yager, R.: Level Sets for Membership Evaluation of Fuzzy Subset, in Fuzzy Sets and Possibility Theory - Recent Developments, pp. 90–97. Pergamon Press, NewYork (1982)Google Scholar

Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  1. 1.Queen’s University BelfastBelfastUK
  2. 2.IIIA, CSICBellaterraSpain

Personalised recommendations