Control Optimization with Learning Automata

  • Abhijit Gosavi
In this chapter, we discuss Learning Automata, an alternative to Reinforcement Learning for solving Markov decision problems (MDPs) and semi-Markov decision problems (SMDPs). We have already discussed the theory of learning automata in the context of parametric optimization. It turns out that learning automata methods can also be useful in control optimization, in particular for solving problems modeled with Markov chains.


