Does a Markov Decision Process have to be random?

In reinforcement learning, a Markov Decision Process (MDP) is used to formalize the problem. On Wikipedia, you can read that an MDP is a "discrete-time stochastic control process." I take "stochastic" to mean that the transition dynamics are probabilistic, i.e. for a given state-action pair $(s_t, a_t)$, the next state $s_{t+1}$ is drawn from some probability distribution $$p(s_{t+1} | s_t, a_t).$$ But what happens if we have a reinforcement learning problem with deterministic transition dynamics, like chess? Can such problems not be modeled as MDPs, if MDPs have to be stochastic by definition?
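To make the question concrete, here is a minimal sketch of the two kinds of dynamics I have in mind. It is only an illustration: the toy states, the tables, and the helper names (`as_distribution`, `sample_next_state`) are my own assumptions and are not taken from any library. The deterministic map $s_{t+1} = f(s_t, a_t)$ can at least be written in the same form as $p(s_{t+1} | s_t, a_t)$ by putting probability 1 on a single next state, which is what makes me wonder whether it still counts as "stochastic" in the sense of the definition.

```python
import random

# Toy stochastic dynamics: (state, action) -> {next_state: probability},
# i.e. a tabular p(s_{t+1} | s_t, a_t). States and numbers are made up.
STOCHASTIC_P = {
    ("s0", "a"): {"s0": 0.3, "s1": 0.7},
    ("s1", "a"): {"s0": 1.0},
}

# Toy deterministic dynamics: s_{t+1} = f(s_t, a_t).
DETERMINISTIC_F = {
    ("s0", "a"): "s1",
    ("s1", "a"): "s0",
}


def as_distribution(f):
    """Rewrite a deterministic map as a distribution with all mass on one state."""
    return {key: {next_state: 1.0} for key, next_state in f.items()}


def sample_next_state(p, state, action, rng):
    """Draw s_{t+1} from p(. | state, action)."""
    dist = p[(state, action)]
    next_states = list(dist)
    weights = [dist[s] for s in next_states]
    return rng.choices(next_states, weights=weights, k=1)[0]


# The deterministic dynamics expressed in the same format as STOCHASTIC_P.
DEGENERATE_P = as_distribution(DETERMINISTIC_F)
```

Sampling from `DEGENERATE_P` with `sample_next_state` always returns the state given by `DETERMINISTIC_F`, so the same sampling code covers both cases in this sketch.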