Abstract
In reinforcement learning with limited exploration, an agent's policy tends to fall into a worthless local optimum. This paper proposes the Observational Reinforcement Learning method, with which the learning agent evaluates inexperienced policies and reinforces them. The method gives the agent more chances to escape from a local optimum without additional exploration. Moreover, this paper demonstrates the effectiveness of the method through experiments on the RoboCup positioning problem, which extend the experiments described in our RoboCup-97 paper [1].
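The core idea — reinforcing a policy the agent never executed, using an evaluation obtained by observation — can be illustrated with a minimal tabular sketch. This is not the paper's actual formulation: the class, the `observe` method, and the `estimated_return` argument are hypothetical stand-ins for however the agent derives a value estimate from watching the environment.

```python
from collections import defaultdict

class ObservationalQLearner:
    """Hedged sketch: tabular Q-learning plus an 'observational' update
    that reinforces actions the agent has never tried itself."""

    def __init__(self, actions, alpha=0.1, gamma=0.9):
        self.q = defaultdict(float)  # (state, action) -> value, default 0
        self.actions = actions
        self.alpha = alpha           # learning rate
        self.gamma = gamma           # discount factor

    def greedy(self, state):
        # Exploit-only action choice; with no exploration, the agent can
        # get stuck on whichever action first looks best.
        return max(self.actions, key=lambda a: self.q[(state, a)])

    def update(self, state, action, reward, next_state):
        # Standard Q-learning update for the action actually executed.
        best_next = max(self.q[(next_state, a)] for a in self.actions)
        td_target = reward + self.gamma * best_next
        self.q[(state, action)] += self.alpha * (td_target - self.q[(state, action)])

    def observe(self, state, unexperienced_action, estimated_return):
        # Observational update (assumption): pull the value of an action
        # the agent never took toward an estimate gained by observation,
        # opening an escape route from a local optimum without exploring.
        self.q[(state, unexperienced_action)] += self.alpha * (
            estimated_return - self.q[(state, unexperienced_action)]
        )
```

Under this sketch, an observed high-value positioning (say, a teammate's) can raise the value of an untried action above the current greedy one, so the exploitation-only policy switches without the agent ever having explored that action itself.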
This work was mainly done while the author was with the Dept. of Mathematical and Computing Sciences, Tokyo Institute of Technology.
References
Andou, T.: “Refinement of Soccer Agents’ Positions Using Reinforcement Learning”, In RoboCup-97: Robot Soccer World Cup I, pp.373–388 (1998).
Kaelbling, L. P.: “Reinforcement Learning: A Survey”, Journal of Artificial Intelligence Research 4, pp.237–285 (1996).
Copyright information
© 1999 Springer-Verlag Berlin Heidelberg
Cite this paper
Andou, T. (1999). Andhill-98: A RoboCup Team which Reinforces Positioning with Observation. In: Asada, M., Kitano, H. (eds) RoboCup-98: Robot Soccer World Cup II. RoboCup 1998. Lecture Notes in Computer Science(), vol 1604. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-48422-1_27
Print ISBN: 978-3-540-66320-1
Online ISBN: 978-3-540-48422-6