
CBR for State Value Function Approximation in Reinforcement Learning

  • Conference paper
Case-Based Reasoning Research and Development (ICCBR 2005)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3620)

Abstract

CBR is one of the techniques that can be applied to the task of approximating a function over high-dimensional, continuous spaces. In Reinforcement Learning systems, a learning agent is faced with the problem of assessing the desirability of the state it finds itself in. If the state space is very large and/or continuous, the availability of a suitable mechanism to approximate a value function – which estimates the value of individual states – is of crucial importance. In this paper, we investigate the use of case-based methods to realise this task. Our approach is evaluated in a case study in robotic soccer simulation.
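To make the idea concrete, the sketch below shows one plausible form of case-based value function approximation: a case base of (state, value) pairs is queried by distance-weighted averaging over the k nearest cases, and values are revised with TD-style targets. The class name, the Gaussian weighting kernel, and all parameters are illustrative assumptions, not the authors' actual design.

```python
import numpy as np

class CaseBasedValueFunction:
    """Case-based approximation of a state value function V(s).

    A case base stores (state, value) pairs; a query state is
    evaluated by distance-weighted averaging over its k nearest
    cases. Hypothetical sketch: the kernel and parameters are
    assumptions, not taken from the paper.
    """

    def __init__(self, k=5, bandwidth=1.0):
        self.k = k                  # number of neighbours to retrieve
        self.bandwidth = bandwidth  # width of the Gaussian kernel
        self.states = []            # case problem parts: feature vectors
        self.values = []            # case solution parts: value estimates

    def add_case(self, state, value):
        self.states.append(np.asarray(state, dtype=float))
        self.values.append(float(value))

    def predict(self, state):
        if not self.states:
            return 0.0  # default estimate for an empty case base
        s = np.asarray(state, dtype=float)
        dists = np.linalg.norm(np.array(self.states) - s, axis=1)
        nearest = np.argsort(dists)[: self.k]
        # Gaussian kernel: closer cases dominate the estimate
        w = np.exp(-(dists[nearest] / self.bandwidth) ** 2) + 1e-12
        return float(np.dot(w, np.array(self.values)[nearest]) / w.sum())

# TD(0)-style usage: nudge the visited state's value towards the
# bootstrapped target and store the result as a revised case.
vf = CaseBasedValueFunction(k=3, bandwidth=0.5)
alpha, gamma = 0.1, 0.95                      # assumed learning rate / discount
state, reward, next_state = [0.2, 0.7], 1.0, [0.25, 0.65]
target = reward + gamma * vf.predict(next_state)
vf.add_case(state, vf.predict(state) + alpha * (target - vf.predict(state)))
```

Variants of this scheme differ mainly in the retrieval step (choice of distance and kernel, or local regression over the neighbours) and in how cases are added, merged, or pruned as learning proceeds.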




Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Gabel, T., Riedmiller, M. (2005). CBR for State Value Function Approximation in Reinforcement Learning. In: Muñoz-Ávila, H., Ricci, F. (eds) Case-Based Reasoning Research and Development. ICCBR 2005. Lecture Notes in Computer Science (LNAI), vol 3620. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11536406_18

  • DOI: https://doi.org/10.1007/11536406_18

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-28174-0

  • Online ISBN: 978-3-540-31855-2

  • eBook Packages: Computer Science (R0)
