Realtime Data Mining, pp. 15–40
Changing Not Just Analyzing: Control Theory and Reinforcement Learning
Chapter
Abstract
We give a short introduction to reinforcement learning. This includes basic concepts like Markov decision processes, policies, state-value and action-value functions, and the Bellman equation. We discuss solution methods such as policy iteration and value iteration, online methods such as temporal-difference learning, and state fundamental convergence results.
It turns out that RL addresses the problems posed in Chap. 2. This shows that, in principle, RL is a suitable instrument for solving all of them.
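The solution methods summarized above can be illustrated with a minimal sketch. The toy two-state MDP below (its states, actions, rewards, and discount factor are invented for this example and do not come from the chapter) shows the Bellman optimality backup that value iteration repeats until convergence, followed by greedy policy extraction from the converged values:

```python
# Value-iteration sketch for a toy 2-state MDP (illustrative only; the
# transition model and discount factor are assumptions, not the chapter's).
# Bellman optimality backup:
#   V(s) <- max_a sum_{s'} P(s'|s,a) * [R(s,a,s') + gamma * V(s')]

GAMMA = 0.9  # discount factor (assumption)

# transitions[state][action] = list of (probability, next_state, reward)
transitions = {
    0: {"stay": [(1.0, 0, 0.0)], "go": [(1.0, 1, 1.0)]},
    1: {"stay": [(1.0, 1, 2.0)], "go": [(1.0, 0, 0.0)]},
}

def value_iteration(transitions, gamma=GAMMA, tol=1e-8):
    """Iterate the Bellman optimality backup until the values stop changing."""
    V = {s: 0.0 for s in transitions}
    while True:
        delta = 0.0
        for s, actions in transitions.items():
            v_new = max(
                sum(p * (r + gamma * V[s2]) for p, s2, r in outcomes)
                for outcomes in actions.values()
            )
            delta = max(delta, abs(v_new - V[s]))
            V[s] = v_new
        if delta < tol:
            return V

V = value_iteration(transitions)

# Greedy policy extraction: pick the action maximizing the one-step lookahead.
policy = {
    s: max(actions, key=lambda a: sum(p * (r + GAMMA * V[s2])
                                      for p, s2, r in actions[a]))
    for s, actions in transitions.items()
}
print(V, policy)  # state 1 earns reward 2 forever: V(1) = 2/(1-0.9) = 20
```

In this toy model the optimal policy is to move to state 1 and stay there; policy iteration would reach the same fixed point by alternating policy evaluation and greedy improvement instead of backing up values directly.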
Keywords
Action-value function · Markov decision process · Bellman equation · Policy iteration · Episodic tasks
Copyright information
© Springer International Publishing Switzerland 2013