Abstract
To scale to large state spaces, reinforcement learning (RL) algorithms must apply function approximation techniques. Research on function approximation for RL has so far focused either on global methods with a static structure or on constructive architectures using locally responsive units. The former, whilst achieving some notable successes, has also failed on some relatively simple tasks. The locally constructive approach is more stable, but may scale poorly to higher-dimensional inputs. This paper examines two globally constructive algorithms based on the Cascor supervised-learning algorithm. These algorithms are applied within the Sarsa RL algorithm, and their performance is compared against a multi-layer perceptron and a locally constructive algorithm, the Resource Allocating Network (RAN). It is shown that the globally constructive algorithms are less stable, but that on some tasks they achieve similar performance to the RAN whilst generating more compact solutions.
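The combination the abstract describes, Sarsa with a parametric function approximator, can be illustrated with a minimal sketch. The update below uses linear function approximation for simplicity (the paper itself studies neural-network approximators such as the MLP, Cascor variants, and the RAN); all names and parameter values here are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def sarsa_update(w, phi, a, r, phi_next, a_next, alpha=0.1, gamma=0.99):
    """One Sarsa update with linear function approximation.

    Q(s, a) is approximated as w[a] . phi(s). The temporal-difference
    error r + gamma * Q(s', a') - Q(s, a) drives a gradient step on
    the weights of the action that was taken.
    """
    q = w[a] @ phi                    # current estimate Q(s, a)
    q_next = w[a_next] @ phi_next     # bootstrap target uses the *next* action (on-policy)
    td_error = r + gamma * q_next - q
    w = w.copy()                      # return updated weights without mutating the input
    w[a] += alpha * td_error * phi    # gradient of Q w.r.t. w[a] is phi
    return w, td_error
```

With a non-linear approximator, the `alpha * td_error * phi` step would be replaced by backpropagating the TD error through the network; the on-policy structure of the update is unchanged.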
© 2005 Springer-Verlag Berlin Heidelberg
Vamplew, P., Ollington, R. (2005). Global Versus Local Constructive Function Approximation for On-Line Reinforcement Learning. In: Zhang, S., Jarvis, R. (eds) AI 2005: Advances in Artificial Intelligence. AI 2005. Lecture Notes in Computer Science(), vol 3809. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11589990_14
Print ISBN: 978-3-540-30462-3
Online ISBN: 978-3-540-31652-7