Homeokinetic Reinforcement Learning
To find a control policy for an autonomous robot by reinforcement learning, the utility of a behaviour can be revealed locally by modulating the motor command with probing actions. For robots with many degrees of freedom, this type of exploration becomes inefficient, which makes it attractive to use an auxiliary controller to select promising probing actions. We suggest here to optimise the exploratory modulation by a self-organising controller. The approach is illustrated by two control tasks, namely the swing-up of a pendulum and walking of a simulated hexapod. The results indicate that the homeokinetic approach is beneficial for high-complexity problems.
Keywords: Utility Function, Reinforcement Learning, Control Task, Motor Command, Reward Function
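The idea of revealing the utility of a behaviour locally through probing actions can be illustrated with a minimal sketch. It uses plain Gaussian perturbations of a parameter vector, not the self-organising (homeokinetic) controller proposed in the paper. The function names, the quadratic `utility`, its target values, and all step sizes are assumptions made up for this illustration.

```python
import random


def utility(params):
    # Hypothetical stand-in for the utility of a behaviour:
    # highest when the parameters match an (assumed) target vector.
    target = [0.5, -0.3, 0.8]
    return -sum((p - t) ** 2 for p, t in zip(params, target))


def probe_step(params, rng, sigma=0.1, step=0.1, n_probes=8):
    """One update: perturb the parameters with random probing actions,
    compare the resulting utilities with the unperturbed one, and move
    in the direction that the probes indicate as beneficial."""
    base = utility(params)
    direction = [0.0] * len(params)
    for _ in range(n_probes):
        delta = [rng.gauss(0.0, sigma) for _ in params]
        gain = utility([p + d for p, d in zip(params, delta)]) - base
        for i, d in enumerate(delta):
            # Probes that increased the utility pull the estimate their way.
            direction[i] += gain * d / (sigma ** 2 * n_probes)
    return [p + step * g for p, g in zip(params, direction)]


def run(n_steps=200, seed=0):
    rng = random.Random(seed)
    params = [0.0, 0.0, 0.0]
    for _ in range(n_steps):
        params = probe_step(params, rng)
    return params


if __name__ == "__main__":
    final = run()
    print("final parameters:", final)
    print("final utility:", utility(final))
```

With only three parameters a handful of random probes per step is enough. The estimate becomes noisier as the number of parameters grows, which is the inefficiency that motivates selecting the probing actions with an auxiliary controller instead of drawing them at random.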