Coding the Environment and MDP Solution

Coding the Environment, Value Iteration, and Policy Iteration Algorithms
  • Mohit SewakEmail author


In this chapter, we will learn one of the most critical skills of coding our environment for any Reinforcement Learning agent to train against. We will create an environment for the grid-world problem such that it is compatible with OpenAI Gym’s environment such that most out-of-box agents could also work on our environment. Next, we will implement the value iteration and the policy iteration algorithm in code and make them work with our environment.

Copyright information

© Springer Nature Singapore Pte Ltd. 2019

Authors and Affiliations

  1. 1.PuneIndia

Personalised recommendations