Abstract
Bayesian reinforcement learning (RL) aims to make more efficient use of data samples, but typically requires significantly more computation. For discrete Markov Decision Processes, a typical approach to Bayesian RL is to sample a set of models from an underlying distribution and compute value functions for each, e.g. using dynamic programming, making the computational cost per sampled model high. Furthermore, the number of model samples to take at each step has mainly been chosen in an ad hoc fashion. We propose a principled method for determining the number of models to sample, based on the parameters of the posterior distribution over models. Our sampling method is local, in that we may choose a different number of samples for each state-action pair. We establish bounds on the error in the value function between a random model sample and the mean model from the posterior distribution. We compare our algorithm against state-of-the-art methods and demonstrate that our method provides a better trade-off between performance and running time.
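The sampling-based approach the abstract describes can be illustrated with a minimal sketch. This is not the paper's algorithm (which chooses the number of samples adaptively per state-action pair from the posterior parameters); it is only the generic baseline it builds on: maintain a Dirichlet posterior over transition probabilities for each state-action pair, draw a fixed number of model samples, and solve each sampled MDP by value iteration. All names, sizes, and the toy reward/count data below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy discrete MDP: S states, A actions, known rewards in [0, 1], and a
# Dirichlet posterior over each row P(. | s, a) given by pseudo-counts.
S, A, gamma = 4, 2, 0.95
R = rng.uniform(0.0, 1.0, size=(S, A))        # reward for (s, a)
counts = rng.integers(1, 10, size=(S, A, S))  # Dirichlet pseudo-counts

def value_iteration(P, R, gamma, tol=1e-8):
    """Standard value iteration for one sampled transition model P[s, a, s']."""
    V = np.zeros(S)
    while True:
        # Q(s, a) = R(s, a) + gamma * sum_s' P(s' | s, a) V(s')
        Q = R + gamma * np.einsum('ijk,k->ij', P, V)
        V_new = Q.max(axis=1)
        if np.max(np.abs(V_new - V)) < tol:
            return V_new
        V = V_new

def sample_model(counts):
    """Draw one full transition model from the posterior, row by row."""
    P = np.empty((S, A, S))
    for s in range(S):
        for a in range(A):
            P[s, a] = rng.dirichlet(counts[s, a])
    return P

# Generic baseline: a fixed number of posterior samples, each solved by
# dynamic programming; the paper's contribution is choosing this number
# in a principled, per-(s, a) way rather than fixing it ad hoc.
n_samples = 10
values = np.stack([value_iteration(sample_model(counts), R, gamma)
                   for _ in range(n_samples)])
V_mean = values.mean(axis=0)   # Monte Carlo estimate of the value function
```

The per-sample cost is one full dynamic-programming solve, which is exactly why the number of samples matters for the performance/running-time trade-off the abstract highlights.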
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Castro, P.S., Precup, D. (2010). Smarter Sampling in Model-Based Bayesian Reinforcement Learning. In: Balcázar, J.L., Bonchi, F., Gionis, A., Sebag, M. (eds) Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2010. Lecture Notes in Computer Science(), vol 6321. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15880-3_19
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-15879-7
Online ISBN: 978-3-642-15880-3