Markov decision processes
Many reinforcement learning (RL) algorithms are defined on Markov decision processes (MDPs). More precisely, the problem that the RL agent is trying to solve is often formulated as an MDP. In this mini-post, I try to answer the following questions:
What are these MDPs?