Abstract
The work of Barto, Sutton and Anderson on the ACE/ASE model for reinforcement learning is here put in perspective. In their work, a state-control (input-output) map that balances a pole hinged on a moving cart for as long as possible is learned when the only information provided by the environment is a failure signal. This work has given rise to a large body of research in machine learning and artificial intelligence. Its relevance lies in the fact that it can be applied to the control of systems that are only partially known. A critical issue is the exploration of the state space, which may require impractical amounts of memory and learning time. Adaptive networks, studied extensively in recent years, offer a natural solution for implementing the learning system: they allow an adaptive partitioning of the state space according to the task difficulty experienced in the different regions.
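The scheme discussed in the abstract can be sketched in code: a fixed BOXES-style partition of the four-dimensional cart-pole state space into 162 regions, with an ASE (actor) weight and an ACE (critic) value per box, trained from nothing but a failure signal. This is a minimal illustrative sketch, not the authors' implementation; the learning rates, trace decays, and Gaussian action noise are assumed values chosen for demonstration.

```python
import math
import random

N_BOXES = 162                 # 3 * 3 * 6 * 3 boxes, as in the BOXES partition
ONE_DEG = math.pi / 180.0
SIX_DEG = 6.0 * ONE_DEG
FIFTY_DEG = 50.0 * ONE_DEG

def get_box(x, x_dot, theta, theta_dot):
    """Map the 4-D cart-pole state onto one of 162 boxes; -1 marks failure."""
    if abs(x) > 2.4 or abs(theta) > 12.0 * ONE_DEG:
        return -1
    box = 0 if x < -0.8 else (1 if x < 0.8 else 2)
    box += 0 if x_dot < -0.5 else (3 if x_dot < 0.5 else 6)
    if theta < -SIX_DEG:   t = 0
    elif theta < -ONE_DEG: t = 1
    elif theta < 0.0:      t = 2
    elif theta < ONE_DEG:  t = 3
    elif theta < SIX_DEG:  t = 4
    else:                  t = 5
    box += 9 * t
    box += 54 * (0 if theta_dot < -FIFTY_DEG else
                 (1 if theta_dot < FIFTY_DEG else 2))
    return box

# Standard cart-pole dynamics, Euler-integrated (illustrative constants).
G, M_CART, M_POLE, POLE_L, FORCE, TAU = 9.8, 1.0, 0.1, 0.5, 10.0, 0.02

def step(state, action):
    x, x_dot, th, th_dot = state
    f = FORCE if action > 0 else -FORCE
    total = M_CART + M_POLE
    tmp = (f + M_POLE * POLE_L * th_dot ** 2 * math.sin(th)) / total
    th_acc = (G * math.sin(th) - math.cos(th) * tmp) / (
        POLE_L * (4.0 / 3.0 - M_POLE * math.cos(th) ** 2 / total))
    x_acc = tmp - M_POLE * POLE_L * th_acc * math.cos(th) / total
    return (x + TAU * x_dot, x_dot + TAU * x_acc,
            th + TAU * th_dot, th_dot + TAU * th_acc)

def train(trials=20, max_steps=500, seed=0):
    rng = random.Random(seed)
    w = [0.0] * N_BOXES                   # ASE: action weights
    v = [0.0] * N_BOXES                   # ACE: value (prediction) weights
    alpha, beta, gamma = 1.0, 0.5, 0.95   # assumed learning constants
    lam_w, lam_v = 0.9, 0.8               # eligibility-trace decay rates
    steps_per_trial = []
    for _ in range(trials):
        e = [0.0] * N_BOXES               # ASE eligibility traces
        xbar = [0.0] * N_BOXES            # ACE eligibility traces
        state = (0.0, 0.0, 0.0, 0.0)
        box = get_box(*state)
        steps = 0
        while box >= 0 and steps < max_steps:
            # ASE: noisy threshold unit chooses push-left or push-right.
            action = 1 if w[box] + rng.gauss(0.0, 0.01) > 0.0 else -1
            e[box] += (1.0 - lam_w) * action
            xbar[box] += 1.0 - lam_v
            p = v[box]                    # ACE prediction for current box
            state = step(state, action)
            box = get_box(*state)
            failed = box < 0
            r = -1.0 if failed else 0.0   # only failure is signalled
            r_hat = r + (0.0 if failed else gamma * v[box]) - p
            for i in range(N_BOXES):      # reinforce along the traces
                w[i] += alpha * r_hat * e[i]
                v[i] += beta * r_hat * xbar[i]
                e[i] *= lam_w
                xbar[i] *= lam_v
            steps += 1
        steps_per_trial.append(steps)
    return w, v, steps_per_trial
```

The fixed 162-box grid here is precisely the kind of a-priori partition the paper questions: an adaptive network would instead refine the partition only where the task proves difficult, avoiding the memory and learning-time cost of uniformly fine resolution.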
References
L.P. Kaelbling, M.L. Littman and A.W. Moore, Reinforcement Learning: A Survey, J. Artificial Intelligence Research 4, (1996) 237–285.
R.J. Williams, Simple statistical gradient-following algorithms for connectionist reinforcement learning, Machine Learning, 8:3, (1992), 229–256.
R.S. Sutton, Learning to predict by the method of temporal differences, Machine Learning, 3:1, (1988), 9–44.
W.T. Miller III, R.S. Sutton and P.J. Werbos (Eds.), Neural Networks for Control, (1992) MIT Press.
A.G. Barto, R.S. Sutton and C.W. Anderson, Neuronlike adaptive elements that can solve difficult learning problems, IEEE Trans. Syst. Man and Cybern. SMC-13, (1983), 834–846.
D. Michie and R.A. Chambers, BOXES: An experiment in adaptive control, Machine Intelligence 2, E. Dale and D. Michie, Eds., Edinburgh: Oliver and Boyd, (1968), 137–152.
B. Widrow, N.K. Gupta and S. Maitra, Punish/reward: learning with a critic in adaptive threshold systems, IEEE Trans. Syst. Man Cybern. SMC-3, (1973), 455–465.
M.I. Jordan and D.E. Rumelhart, Forward models: Supervised learning with a distal teacher. Cognitive Science, 16, (1992), 307–354.
N.A. Borghese and M. Arbib, Generation of Temporal Sequences using Local Dynamic Programming, Neural Networks 8:1, (1995), 39–54.
C.W. Anderson, Learning to control an inverted pendulum using neural networks. IEEE Control Systems Magazine (1989) 31–37.
H.A. Berenji and P. Khedkar, Learning and tuning fuzzy logic controllers through reinforcement. IEEE Trans. on Neural Networks, 3:5 (1992), 724–740.
N.A. Borghese and S. Ferrari, Hierarchical RBF networks in function approximation, Neurocomputing, (1998).
M. Cannon and J.E. Slotine, Space-Frequency localized basis function networks for nonlinear system estimation and control, Neurocomputing 9 (1995) 293–342.
B. Fritzke, Growing Cell Structures: A Self-organizing Network for Unsupervised and Supervised Learning, Neural Networks 7 (1994) 1441–1460.
T.M. Martinetz, S.G. Berkovich and K.J. Schulten, "Neural-Gas" Network for Vector Quantization and its Application to Time-Series Prediction, IEEE Trans. Neural Networks 4:4 (1993) 558–568.
A. Baraldi and E. Alpaydin, Simplified ART: A new class of ART algorithms, ICSI TR-98-004, (1998), Berkeley CA.
A.W. Moore, Variable resolution dynamic programming: Efficiently learning action maps in multivariate real-valued spaces. In Proc. Eighth International Machine Learning Workshop, (1991).
V. Gullapalli, A stochastic reinforcement learning algorithm for learning real-valued functions, Neural Networks 3, (1990) 671–692.
G. Tesauro, TD-Gammon, a self-teaching backgammon program which achieves master-level play. Neural Computation, 6(2), (1994) 215–219.
© 1999 Springer-Verlag London Limited
Cite this paper
Borghese, N.A., Biemmi, C., Monaco, F.L. (1999). Learning to balance a pole on a movable cart through RL: what can be gained using Adaptive NN?. In: Marinaro, M., Tagliaferri, R. (eds) Neural Nets WIRN VIETRI-98. Perspectives in Neural Computing. Springer, London. https://doi.org/10.1007/978-1-4471-0811-5_16
DOI: https://doi.org/10.1007/978-1-4471-0811-5_16
Publisher Name: Springer, London
Print ISBN: 978-1-4471-1208-2
Online ISBN: 978-1-4471-0811-5