Abstract
Feature selection in reinforcement learning (RL), i.e. choosing basis functions such that useful approximations of the unknown value function can be obtained, is one of the main challenges in scaling RL to real-world applications. Here we consider the Gaussian process based framework GPTD for approximate policy evaluation, and propose feature selection through marginal likelihood optimization of the associated hyperparameters. Our approach offers two appealing benefits: (1) given just sample transitions, we can solve the policy evaluation problem fully automatically (without looking at the learning task and, in theory, independently of the dimensionality of the state space), and (2) model selection allows us to consider more sophisticated kernels, which in turn enable us to identify relevant subspaces and eliminate irrelevant state variables, yielding substantial computational savings and improved prediction performance.
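The core mechanism described above, learning per-dimension kernel hyperparameters by maximizing the marginal likelihood so that irrelevant state variables are automatically pruned, can be illustrated in a minimal regression sketch. This is not the GPTD algorithm itself (which works on sample transitions and Bellman residuals); it only shows automatic relevance determination (ARD) with a squared-exponential kernel on toy data, where the target depends on one of two inputs. The data, variable names, and optimizer settings are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import minimize

# Toy data: y depends only on the first of two "state variables".
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(60, 2))
y = np.sin(3 * X[:, 0]) + 0.05 * rng.standard_normal(60)

def neg_log_marginal_likelihood(log_params):
    """Negative GP log marginal likelihood, ARD squared-exponential kernel."""
    ell = np.exp(log_params[:2])   # per-dimension lengthscales
    sf2 = np.exp(log_params[2])    # signal variance
    sn2 = np.exp(log_params[3])    # noise variance
    d = (X[:, None, :] - X[None, :, :]) / ell
    K = sf2 * np.exp(-0.5 * np.sum(d ** 2, axis=-1))
    K += (sn2 + 1e-6) * np.eye(len(X))  # noise plus jitter for stability
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    return (0.5 * y @ alpha + np.sum(np.log(np.diag(L)))
            + 0.5 * len(X) * np.log(2 * np.pi))

res = minimize(neg_log_marginal_likelihood, np.zeros(4), method="L-BFGS-B")
ell = np.exp(res.x[:2])
# The learned lengthscale of the irrelevant second dimension typically
# grows much larger than that of the relevant first dimension,
# effectively eliminating that state variable from the model.
print(ell)
```

In the paper's setting the same principle applies to the kernel used by GPTD: a large learned lengthscale flags a state variable the value function does not depend on, which can then be dropped for computational savings.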
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
Cite this paper
Jung, T., Stone, P. (2009). Feature Selection for Value Function Approximation Using Bayesian Model Selection. In: Buntine, W., Grobelnik, M., Mladenić, D., Shawe-Taylor, J. (eds) Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2009. Lecture Notes in Computer Science, vol 5781. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04180-8_60
Print ISBN: 978-3-642-04179-2
Online ISBN: 978-3-642-04180-8