Changing Not Just Analyzing: Control Theory and Reinforcement Learning

  • Alexander Paprotny
  • Michael Thess
Part of the Applied and Numerical Harmonic Analysis book series (ANHA)


We give a short introduction to reinforcement learning. This includes basic concepts such as Markov decision processes, policies, state-value and action-value functions, and the Bellman equation. We discuss solution methods such as policy iteration and value iteration, online methods such as temporal-difference learning, and state fundamental convergence results.
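The value iteration method mentioned above repeatedly applies the Bellman optimality operator until the state values stabilize. A minimal sketch on a toy MDP (the states, transition probabilities, rewards, and discount factor below are invented for illustration and are not from the chapter):

```python
import numpy as np

# Hypothetical toy MDP: 3 states, 2 actions.
# P[a][s, s'] = transition probability under action a,
# R[a][s]     = expected immediate reward for taking a in s.
P = [np.array([[0.8, 0.2, 0.0],
               [0.1, 0.8, 0.1],
               [0.0, 0.2, 0.8]]),
     np.array([[0.5, 0.5, 0.0],
               [0.0, 0.5, 0.5],
               [0.0, 0.0, 1.0]])]
R = [np.array([1.0, 0.0, 0.0]),
     np.array([0.0, 0.0, 2.0])]
gamma = 0.9  # discount factor

# Value iteration: apply the Bellman optimality operator
#   (T V)(s) = max_a [ R_a(s) + gamma * sum_{s'} P_a(s, s') V(s') ]
# until the update changes V by less than a tolerance.
V = np.zeros(3)
for _ in range(1000):
    Q = np.array([R[a] + gamma * P[a] @ V for a in range(2)])
    V_new = Q.max(axis=0)
    if np.abs(V_new - V).max() < 1e-8:
        break
    V = V_new

# Greedy policy with respect to the converged action values.
policy = Q.argmax(axis=0)
```

Because the operator is a gamma-contraction, the iteration converges to the unique optimal value function regardless of the starting point.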

It turns out that RL addresses the problems from Chap. 2. This shows that, in principle, RL is a suitable instrument for solving all of these problems.
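Temporal-difference learning, the online method named in the abstract, estimates state values from sampled transitions without a model of the dynamics. A tabular TD(0) sketch on a hypothetical 1-D random walk (all states, rewards, and parameters here are illustrative assumptions, not taken from the chapter):

```python
import random

random.seed(0)

# Hypothetical random walk: states 0..6, with 0 and 6 terminal.
# Reward 1 is received only on reaching state 6; the policy moves
# left or right with equal probability.
N, gamma, alpha = 7, 1.0, 0.1
V = [0.0] * N  # state-value estimates for the fixed random policy

for episode in range(5000):
    s = 3  # every episode starts in the middle state
    while s not in (0, N - 1):
        s_next = s + random.choice((-1, 1))
        r = 1.0 if s_next == N - 1 else 0.0
        # TD(0) update: V(s) <- V(s) + alpha * [r + gamma*V(s') - V(s)]
        V[s] += alpha * (r + gamma * V[s_next] - V[s])
        s = s_next
```

For this walk the true values are V(s) = s/6 for the non-terminal states, and the estimates approach them as more episodes are sampled.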


Keywords: Action-value function, Markov decision process, Bellman equation, Policy iteration, Episodic tasks



Copyright information

© Springer International Publishing Switzerland 2013

Authors and Affiliations

  • Alexander Paprotny (1)
  • Michael Thess (2)
  1. Research and Development, prudsys AG, Berlin, Germany
  2. Research and Development, prudsys AG, Chemnitz, Germany
