Using a case base of surfaces to speed-up reinforcement learning

Drummond, Chris

doi:10.1007/3-540-63233-6_513

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 1266)

Included in the following conference series:

International Conference on Case-Based Reasoning

Abstract

This paper demonstrates how certain vision processing techniques can be used to index into a case base of surfaces. The surfaces are the result of reinforcement learning and represent the optimum choice of actions to achieve some goal from anywhere in the state space. The paper shows how strong features that occur in the interaction of the system with its environment can be detected early in the learning process. Such features allow the system to identify when an identical, or very similar, task has been solved previously and to retrieve the relevant surface. This results in an orders-of-magnitude increase in learning rate.
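The retrieval idea in the abstract can be illustrated with a small sketch. The abstract does not say which vision techniques or similarity measure the paper uses, so everything here is an assumption made for illustration: surfaces are small 2D grids of values, "strong features" are approximated by thresholded finite differences, and cases are compared by Jaccard overlap of their edge cells. The names `edge_map`, `overlap`, and `retrieve` are hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Set, Tuple

Grid = List[List[float]]
Cell = Tuple[int, int]


def edge_map(surface: Grid, threshold: float) -> Set[Cell]:
    """Cells whose value differs from the right or lower neighbour by more than threshold.

    Assumption: a simple stand-in for the paper's vision-based feature detection.
    """
    rows, cols = len(surface), len(surface[0])
    edges: Set[Cell] = set()
    for r in range(rows):
        for c in range(cols):
            right = abs(surface[r][c + 1] - surface[r][c]) if c + 1 < cols else 0.0
            down = abs(surface[r + 1][c] - surface[r][c]) if r + 1 < rows else 0.0
            if max(right, down) > threshold:
                edges.add((r, c))
    return edges


def overlap(a: Set[Cell], b: Set[Cell]) -> float:
    """Jaccard overlap of two edge sets (assumed similarity measure)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def retrieve(case_base: Dict[str, Grid], partial: Grid, threshold: float) -> Tuple[str, float]:
    """Return the name and score of the stored surface whose edges best match the partial one."""
    query = edge_map(partial, threshold)
    scored = [(overlap(query, edge_map(surface, threshold)), name)
              for name, surface in case_base.items()]
    score, name = max(scored)
    return name, score


# Two previously learned surfaces (hypothetical data): a vertical and a horizontal step.
case_base = {
    "wall_vertical": [[0.1, 0.1, 0.9, 0.9] for _ in range(4)],
    "wall_horizontal": [[0.1] * 4, [0.1] * 4, [0.9] * 4, [0.9] * 4],
}

# An early, not yet converged surface that already shows a vertical step.
partial = [[0.0, 0.0, 0.5, 0.5] for _ in range(4)]

best_name, best_score = retrieve(case_base, partial, threshold=0.3)
```

In this toy setting the early surface already shares its step location with one stored case, so that case is the one retrieved; a learner could then start from the stored surface rather than from scratch, which is the kind of speed-up the abstract describes.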

Author information

Authors and Affiliations

Department of Computer Science, University of Ottawa, K1N 6N5, Ottawa, Ontario, Canada
Chris Drummond

Editor information

David B. Leake, Enric Plaza

About this paper

Cite this paper

Drummond, C. (1997). Using a case base of surfaces to speed-up reinforcement learning. In: Leake, D.B., Plaza, E. (eds) Case-Based Reasoning Research and Development. ICCBR 1997. Lecture Notes in Computer Science, vol 1266. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-63233-6_513

DOI: https://doi.org/10.1007/3-540-63233-6_513
Published: 08 June 2005
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-63233-7
Online ISBN: 978-3-540-69238-6
eBook Packages: Springer Book Archive
