
The complexity of policy evaluation for finite-horizon partially-observable Markov decision processes

Mathematical Foundations of Computer Science 1997 (MFCS 1997)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 1295)

Abstract

A partially-observable Markov decision process (POMDP) is a generalization of a Markov decision process that allows for incomplete information regarding the state of the system. We consider several flavors of finite-horizon POMDPs. Our results concern the complexity of the policy evaluation and policy existence problems, which are characterized in terms of completeness for complexity classes.
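
As a rough illustration of the policy evaluation problem, the sketch below computes the expected total reward of a fixed policy over a finite horizon by pushing a state distribution forward one step at a time. Everything concrete here is an assumption made for illustration and is not taken from the paper: the dictionary layout, the names `evaluate_policy`, `transitions`, `rewards`, `observe`, and `policy`, the deterministic observation function, and the restriction to stationary policies that pick an action from the current observation alone. It shows what is being evaluated, not how the complexity bound is obtained.

```python
from fractions import Fraction


def evaluate_policy(transitions, rewards, observe, policy, start, horizon):
    """Expected total reward of an observation-based policy over `horizon` steps.

    transitions[s][a] -> dict mapping next state to probability
    rewards[s][a]     -> immediate reward for taking action a in state s
    observe[s]        -> observation emitted in state s (assumed deterministic)
    policy[o]         -> action chosen on observation o (assumed stationary)
    """
    # dist[s] is the probability of being in state s at the current step.
    dist = {start: Fraction(1)}
    total = Fraction(0)
    for _ in range(horizon):
        next_dist = {}
        for s, p in dist.items():
            # The policy sees only the observation, never the state itself.
            a = policy[observe[s]]
            total += p * rewards[s][a]
            for s2, q in transitions[s][a].items():
                next_dist[s2] = next_dist.get(s2, Fraction(0)) + p * q
        dist = next_dist
    return total


# Hypothetical two-state instance: both states emit the same observation "x",
# so the process is effectively unobservable.
half = Fraction(1, 2)
transitions = {
    0: {"a": {0: half, 1: half}, "b": {0: Fraction(1)}},
    1: {"a": {1: Fraction(1)}, "b": {0: Fraction(1)}},
}
rewards = {0: {"a": 1, "b": 0}, 1: {"a": -1, "b": 0}}
observe = {0: "x", 1: "x"}
policy = {"x": "a"}

value_h3 = evaluate_policy(transitions, rewards, observe, policy, 0, 3)
value_h4 = evaluate_policy(transitions, rewards, observe, policy, 0, 4)
print(value_h3, value_h4)
```

On this toy instance the value is 1/2 for horizon 3 and -1/4 for horizon 4, so a decision version of the problem (is the expected total reward positive?) gets a different answer depending on the horizon. Exact rationals are used so that comparison against zero is not affected by floating-point rounding.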

We prove a new upper bound for the policy evaluation problem for POMDPs, showing it is complete for Probabilistic Logspace. From this, we prove policy existence problems for several variants of unobservable, succinctly represented MDPs to be complete for NP^PP, a class for which not many natural problems are known to be complete.

Supported in part by the Office of the Vice Chancellor for Research and Graduate Studies at the University of Kentucky, and by the Deutsche Forschungsgemeinschaft (DFG), grant Mu 1226/2-1. Part of the work was done at University of Kentucky.

Supported in part by NSF grant CCR-9315354.

Supported in part by NSF grant 9509603. Portions of the work were performed while at the Institute of Mathematical Sciences, Chennai (Madras), India, and at the Wilhelm-Schickard Institut für Informatik, Universität Tübingen (supported by DFG grant TU 7/117-1).




Editor information

Igor Prívara, Peter Ružička


Copyright information

© 1997 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Mundhenk, M., Goldsmith, J., Allender, E. (1997). The complexity of policy evaluation for finite-horizon partially-observable Markov decision processes. In: Prívara, I., Ružička, P. (eds) Mathematical Foundations of Computer Science 1997. MFCS 1997. Lecture Notes in Computer Science, vol 1295. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0029956


  • DOI: https://doi.org/10.1007/BFb0029956


  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-63437-9

  • Online ISBN: 978-3-540-69547-9

  • eBook Packages: Springer Book Archive
