A Theoretical Comparison of Models
We have seen in the previous chapter that Markov Decision Processes (MDPs) can be considered an “ideal” approach to the implementation of intelligent agents. Assigning utilities to states and probabilities to transitions between states might be regarded as a questionable way to solve the problem of preference, but there are many situations in which it is acceptable. Once we accept that the problem is correctly formulated in terms of the probabilities of actions having particular effects, with certain states having higher rewards than others, the MDP solution algorithms yield policies that are optimal in the maximum expected utility (MEU) sense. By a policy we mean a mapping of states into actions that tells the agent what to do in each state, based on the probable outcomes of every possible action.
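To make the notion of an MEU-optimal policy concrete, the following is a minimal sketch using value iteration, one standard MDP solution algorithm. The chapter does not say which algorithm it has in mind, so this choice is an assumption. The two-state model, the state and action names, the reward values, and the discount factor are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical two-state MDP, invented for illustration only.
# TRANSITIONS[state][action] is a list of (probability, next_state) pairs.
Outcomes = List[Tuple[float, str]]
Transitions = Dict[str, Dict[str, Outcomes]]

TRANSITIONS: Transitions = {
    "cold": {
        "stay": [(1.0, "cold")],
        "move": [(0.8, "warm"), (0.2, "cold")],
    },
    "warm": {
        "stay": [(1.0, "warm")],
        "move": [(0.8, "cold"), (0.2, "warm")],
    },
}
REWARDS: Dict[str, float] = {"cold": 0.0, "warm": 1.0}  # assumed state rewards
GAMMA = 0.9  # assumed discount factor


def expected_utility(outcomes: Outcomes, utilities: Dict[str, float]) -> float:
    """Probability-weighted utility of the states an action may lead to."""
    return sum(p * utilities[next_state] for p, next_state in outcomes)


def value_iteration(transitions: Transitions, rewards: Dict[str, float],
                    gamma: float, tol: float = 1e-9) -> Dict[str, float]:
    """Repeat the Bellman update until the utilities stop changing."""
    utilities = {s: 0.0 for s in transitions}
    while True:
        updated = {
            s: rewards[s] + gamma * max(
                expected_utility(outcomes, utilities)
                for outcomes in transitions[s].values()
            )
            for s in transitions
        }
        delta = max(abs(updated[s] - utilities[s]) for s in transitions)
        utilities = updated
        if delta < tol:
            return utilities


def greedy_policy(transitions: Transitions,
                  utilities: Dict[str, float]) -> Dict[str, str]:
    """Map each state to the action with maximum expected utility."""
    return {
        s: max(actions, key=lambda a: expected_utility(actions[a], utilities))
        for s, actions in transitions.items()
    }


utilities = value_iteration(TRANSITIONS, REWARDS, GAMMA)
policy = greedy_policy(TRANSITIONS, utilities)
print(policy)
```

The resulting policy is exactly a mapping of states into actions: for each state it picks the action whose probable outcomes have the highest expected utility under the converged utilities.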
Keywords: State Space, Action Space, Markov Decision Process, Belief State, Reward Function