Abstract
This paper introduces an approach to improving an approximate solution in reinforcement learning by augmenting it with a small overriding patch. Many approximate solutions are smaller and easier to produce than a flat solution, but the best solution within the constraints of the approximation may fall well short of global optimality. We present a technique for efficiently learning a small patch to reduce this gap. Empirical evaluation demonstrates the effectiveness of patching, producing combined solutions that are much closer to global optimality.
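The core idea the abstract describes — an approximate base solution combined with a small overriding patch — can be sketched in a few lines. This is a minimal illustrative sketch only, not the authors' algorithm: the policy representation, the `make_patched_policy` helper, and the grid-world states are all assumptions made for illustration.

```python
# Sketch: an approximate policy overridden by a small learned patch.
# The patch is a sparse mapping from states to actions; everywhere
# else, the (cheap but suboptimal) approximate solution is followed.

def make_patched_policy(approx_policy, patch):
    """Return a policy that follows `patch` where it is defined,
    falling back to the approximate solution elsewhere."""
    def policy(state):
        return patch.get(state, approx_policy(state))
    return policy

# Hypothetical example: a coarse policy that always moves right,
# patched in two states where that action falls short of optimal.
approx = lambda state: "right"
patch = {(0, 3): "up", (1, 3): "up"}

policy = make_patched_policy(approx, patch)
print(policy((0, 3)))  # overridden by the patch
print(policy((5, 5)))  # falls back to the approximation
```

Because the patch is small relative to a flat solution, the combined policy keeps most of the approximation's compactness while closing part of the gap to global optimality.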
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
Cite this paper
Kim, M.S., Uther, W. (2006). Patching Approximate Solutions in Reinforcement Learning. In: Fürnkranz, J., Scheffer, T., Spiliopoulou, M. (eds) Machine Learning: ECML 2006. Lecture Notes in Computer Science, vol 4212. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11871842_27
DOI: https://doi.org/10.1007/11871842_27
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-45375-8
Online ISBN: 978-3-540-46056-5