Approximate Equivalence of Markov Decision Processes

Even-Dar, Eyal; Mansour, Yishay

doi:10.1007/978-3-540-45167-9_42

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 2777)

Abstract

We consider the problem of finding the minimal ε-equivalent MDP for an MDP given in its tabular form. We show that the problem is NP-hard and then give a bicriteria approximation algorithm for it. We suggest that the right measure for finding a minimal ε-equivalent model is L₁ rather than L∞, giving both an example that demonstrates the drawback of using L∞ and performance guarantees for using L₁. In addition, we give a polynomial-time algorithm that decides whether two MDPs are equivalent.
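
To make the contrast between the two norms concrete, here is a minimal sketch in Python. It is not the example from the chapter, whose details are not reproduced on this page; it only illustrates a general fact about probability vectors. The function names and the choice of two uniform distributions on disjoint supports are assumptions made for illustration. Two next-state distributions that share no reachable state can still be very close in L∞ when the probability mass is spread thinly, whereas their L₁ distance stays at its maximum of 2.

```python
# Illustrative sketch only: names and the example distributions are
# assumptions, not taken from the chapter.

def l1_distance(p, q):
    """Sum of absolute differences between two probability vectors."""
    return sum(abs(a - b) for a, b in zip(p, q))


def linf_distance(p, q):
    """Largest absolute difference between two probability vectors."""
    return max(abs(a - b) for a, b in zip(p, q))


def disjoint_uniform_pair(n):
    """Two uniform distributions over 2n states with disjoint supports.

    p puts mass 1/n on each of the first n states,
    q puts mass 1/n on each of the last n states.
    """
    p = [1.0 / n] * n + [0.0] * n
    q = [0.0] * n + [1.0 / n] * n
    return p, q


if __name__ == "__main__":
    for n in (1, 10, 100, 1000):
        p, q = disjoint_uniform_pair(n)
        # L-infinity shrinks like 1/n; L1 stays (up to rounding) at 2.
        print(n, linf_distance(p, q), l1_distance(p, q))
```

In this sketch the L∞ distance shrinks as 1/n while the L₁ distance does not, so a fixed tolerance ε under L∞ would eventually treat the two distributions as close even though they never lead to a common state.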

Author information

Authors and Affiliations

School of Computer Science, Tel Aviv University, Tel-Aviv, 69978, Israel
Eyal Even-Dar & Yishay Mansour

Editor information

Editors and Affiliations

MPI for Biological Cybernetics, Spemannstr. 38, 72076, Tübingen, Germany
Bernhard Schölkopf
University of California, Santa Cruz
Manfred K. Warmuth

About this paper

Cite this paper

Even-Dar, E., Mansour, Y. (2003). Approximate Equivalence of Markov Decision Processes. In: Schölkopf, B., Warmuth, M.K. (eds) Learning Theory and Kernel Machines. Lecture Notes in Computer Science, vol 2777. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-45167-9_42

DOI: https://doi.org/10.1007/978-3-540-45167-9_42
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-40720-1
Online ISBN: 978-3-540-45167-9
eBook Packages: Springer Book Archive
