Coding the Environment and MDP Solution
- 3.7k Downloads
In this chapter, we will learn one of the most critical skills of coding our environment for any Reinforcement Learning agent to train against. We will create an environment for the grid-world problem such that it is compatible with OpenAI Gym’s environment such that most out-of-box agents could also work on our environment. Next, we will implement the value iteration and the policy iteration algorithm in code and make them work with our environment.