Approximate Equivalence of Markov Decision Processes

  • Conference paper
Learning Theory and Kernel Machines

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 2777)

Abstract

We consider the problem of finding the minimal ε-equivalent MDP for an MDP given in its tabular form. We show that the problem is NP-hard and then give a bicriteria approximation algorithm for it. We suggest that the right measure for finding a minimal ε-equivalent model is L1 rather than L∞, by giving both an example that demonstrates the drawback of using L∞ and performance guarantees for using L1. In addition, we give a polynomial algorithm that decides whether two MDPs are equivalent.
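The abstract does not reproduce the paper's own example, so the following is only an illustrative sketch of one way the two norms can disagree, not the authors' construction. It compares two next-state distributions that are uniform over disjoint sets of n states: their L∞ distance shrinks as 1/n while their L1 distance stays at 2. The helper names (`l1_distance`, `linf_distance`, `disjoint_uniform`) are hypothetical and chosen for this illustration.

```python
from fractions import Fraction


def l1_distance(p, q):
    """Sum of absolute differences between two distributions."""
    return sum(abs(a - b) for a, b in zip(p, q))


def linf_distance(p, q):
    """Largest absolute difference in any single coordinate."""
    return max(abs(a - b) for a, b in zip(p, q))


def disjoint_uniform(n):
    """Two uniform distributions over 2n states with disjoint supports.

    Hypothetical construction for illustration only: p puts mass 1/n on
    the first n states, q puts mass 1/n on the last n states.
    """
    w = Fraction(1, n)
    zero = Fraction(0)
    p = [w] * n + [zero] * n
    q = [zero] * n + [w] * n
    return p, q


if __name__ == "__main__":
    for n in (1, 10, 100):
        p, q = disjoint_uniform(n)
        # L-infinity falls as 1/n even though p and q never overlap,
        # while L1 stays at its maximum value of 2.
        print(n, linf_distance(p, q), l1_distance(p, q))
```

Under this construction, a small L∞ threshold would treat the two distributions as close although they share no support, whereas L1 keeps them maximally far apart.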

Copyright information

© 2003 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Even-Dar, E., Mansour, Y. (2003). Approximate Equivalence of Markov Decision Processes. In: Schölkopf, B., Warmuth, M.K. (eds) Learning Theory and Kernel Machines. Lecture Notes in Computer Science, vol 2777. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-45167-9_42

  • DOI: https://doi.org/10.1007/978-3-540-45167-9_42

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-40720-1

  • Online ISBN: 978-3-540-45167-9

  • eBook Packages: Springer Book Archive
