Abstract
This paper describes a new translation from a Partially Observable MDP (POMDP) into a fully observable MDP. Unlike the classical translation, the resulting problem has a finite state space, so MDP solvers can solve this simplified version of the initial partially observable problem. The approach encodes agent beliefs as possibility distributions over states, leading to an MDP whose state space is a finite set of epistemic states. After a short description of the POMDP framework and of notions from Possibility Theory, the translation is described formally, with semantic arguments. The actual computations of this transformation are then detailed, so as to take full advantage of the factored structure of the initial POMDP in reducing the size of the final MDP and in structuring it. Finally, the size reduction and tractability of the resulting MDP are illustrated on a simple POMDP problem.
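To make the finiteness claim concrete, the following minimal sketch counts the possibility distributions available over a small state set. It assumes, as is standard in qualitative Possibility Theory but not stated in the abstract, that possibility degrees are drawn from a finite scale and that a distribution is normalized (at least one state has degree 1). The function names, the choice of an evenly spaced scale, and the sizes used are illustrative assumptions, not the authors' implementation.

```python
from fractions import Fraction
from itertools import product


def possibility_scale(levels):
    """Hypothetical finite scale {0, 1/(levels-1), ..., 1}; requires levels >= 2."""
    return [Fraction(i, levels - 1) for i in range(levels)]


def epistemic_states(num_states, levels):
    """Enumerate normalized possibility distributions over `num_states` states."""
    scale = possibility_scale(levels)
    return [
        dist
        for dist in product(scale, repeat=num_states)
        if max(dist) == 1  # normalization: some state is fully possible
    ]


def count_epistemic_states(num_states, levels):
    """Closed form: all assignments minus those that never reach degree 1."""
    return levels ** num_states - (levels - 1) ** num_states


beliefs = epistemic_states(num_states=3, levels=3)
print(len(beliefs))                  # 19
print(count_epistemic_states(3, 3))  # 19
```

Under these assumptions the number of epistemic states is finite but grows exponentially with the number of states, which is consistent with the abstract's emphasis on exploiting factored structure to keep the resulting MDP small. By contrast, probabilistic beliefs range over a continuous simplex, so the classical belief MDP has an infinite state space.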
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Drougard, N., Dubois, D., Farges, JL., Teichteil-Königsbuch, F. (2015). Planning in Partially Observable Domains with Fuzzy Epistemic States and Probabilistic Dynamics. In: Beierle, C., Dekhtyar, A. (eds) Scalable Uncertainty Management. SUM 2015. Lecture Notes in Computer Science, vol 9310. Springer, Cham. https://doi.org/10.1007/978-3-319-23540-0_15
DOI: https://doi.org/10.1007/978-3-319-23540-0_15
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-23539-4
Online ISBN: 978-3-319-23540-0
eBook Packages: Computer Science, Computer Science (R0)