
Planning in Partially Observable Domains with Fuzzy Epistemic States and Probabilistic Dynamics

  • Nicolas Drougard
  • Didier Dubois
  • Jean-Loup Farges
  • Florent Teichteil-Königsbuch
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9310)

Abstract

A new translation from Partially Observable MDPs (POMDPs) into fully observable MDPs is described here. Unlike the classical translation, whose belief space is a continuous probability simplex, the resulting state space is finite, so that standard MDP solvers can tackle this simplified version of the initial partially observable problem: the approach encodes the agent's beliefs as possibility distributions over states, leading to an MDP whose state space is a finite set of epistemic states. After a short presentation of the POMDP framework and of the needed notions of Possibility Theory, the translation is formally described and supported by semantic arguments. The actual computation of this transformation is then detailed, so as to fully exploit the factored structure of the initial POMDP in reducing and structuring the final MDP. Finally, the size reduction and the tractability of the resulting MDP are illustrated on a simple POMDP problem.
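
As an illustration of why the epistemic state space is finite, the sketch below implements a qualitative possibilistic belief update over the finite scale L = {0, 1/k, ..., 1} (here k = 2). This is a minimal sketch under assumed notation, not the paper's own definitions: the dictionaries T and Obs encoding transition and observation possibilities, the operator names predict/revise, and the tiny two-state example are all illustrative.

    # Minimal sketch of a qualitative possibilistic belief update
    # (assumed notation, not the paper's exact operators). States,
    # actions and observations are finite; possibility degrees live
    # in the finite scale L = {0, 1/k, ..., 1}.
    K = 2  # illustrative scale resolution: L = {0, 0.5, 1}

    def predict(pi, T, a, states):
        """Prediction step: pi1(s2) = max_s min(T[s, a, s2], pi[s])."""
        return {s2: max(min(T[(s, a, s2)], pi[s]) for s in states)
                for s2 in states}

    def revise(pi1, Obs, a, o, states):
        """Qualitative (min-based) conditioning on observation o: the
        states of maximal joint possibility are raised to degree 1."""
        joint = {s: min(Obs[(s, a, o)], pi1[s]) for s in states}
        top = max(joint.values())
        return {s: 1.0 if joint[s] == top and top > 0 else joint[s]
                for s in states}

    def update(pi, T, Obs, a, o, states):
        """One epistemic transition: predict with a, then revise with o."""
        return revise(predict(pi, T, a, states), Obs, a, o, states)

    # Tiny two-state example with a partially informative sensor.
    states = ("ok", "faulty")
    T = {(s, "wait", s2): 1.0 if s == s2 else 0.5
         for s in states for s2 in states}
    Obs = {(s, "wait", o): 1.0 if s == o else 0.5
           for s in states for o in states}
    pi0 = {s: 1.0 for s in states}  # complete ignorance: all states possible
    print(update(pi0, T, Obs, "wait", "ok"))  # -> {'ok': 1.0, 'faulty': 0.5}

Since every degree stays in the finite scale, at most (k+1)^|S| - k^|S| normalized possibility distributions exist over |S| states, so repeated updates can only ever visit finitely many epistemic states; this finiteness is what turns the translated problem into a standard, fully observable MDP.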

Keywords

Markov Decision Process · Epistemic State · Possibility Distribution · Partially Observable Markov Decision Process · Observation Variable

Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Nicolas Drougard (1)
  • Didier Dubois (2)
  • Jean-Loup Farges (1)
  • Florent Teichteil-Königsbuch (1)
  1. Onera – The French Aerospace Lab, Toulouse, France
  2. IRIT, CNRS and Université de Toulouse, Toulouse, France
