Is there a rigorous treatment of Markov decision processes?


I am trying to find a mathematically rigorous introduction to Markov decision processes (MDPs).

There are tons of resources online, but all of them are ... frankly terrible (and not even properly typeset).

To pick a few, here are the top three links Google returns for "MDP tutorials":

https://engineering.purdue.edu/~givan/talks/mdp-tutorial.pdf

http://www.cs.cmu.edu/~./awm/tutorials/mdp09.pdf

https://hub.packtpub.com/reinforcement-learning-mdp-markov-decision-process-tutorial/

I don't know why, but they all read something like:

An MDP is a tuple $(S,A,R,P)$ (sometimes $(S,A,R,P,\pi)$, or $(S,A,R,P,\pi,\gamma)$, or $(S,A,R,P,\pi,\gamma,T)$), where $S$ is a set of states, $A$ is a set of actions, $R$ is (depending on the author) a deterministic function, a finite set of values, or a probability distribution over the actions and states, and $P$ is a conditional probability or sometimes a conditional probability distribution.

In addition, almost all of the references use different notation (with various subscripts and superscripts added) and make different assumptions, yet every one of these objects is called an MDP. I just want one standard reference that eliminates all these ambiguities and conflicting definitions.
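
To make the request concrete, here is a sketch of the kind of unambiguous statement I am hoping for, written for the finite, infinite-horizon, discounted case only. The choices below (a deterministic reward $R$, the discount factor $\gamma$ included in the tuple, the policy $\pi$ kept outside it) are my own assumptions for illustration, not a definition taken from any of the links above.

$$
\begin{aligned}
&\text{An MDP is a tuple } (S, A, P, R, \gamma), \text{ where} \\
&\quad S \text{ and } A \text{ are finite nonempty sets (states and actions)}, \\
&\quad P(\cdot \mid s, a) \text{ is a probability distribution on } S \text{ for each } (s, a) \in S \times A, \\
&\quad R : S \times A \to \mathbb{R}, \qquad \gamma \in [0, 1).
\end{aligned}
$$

Under these assumptions, a (stationary) policy is a separate object: a map $\pi$ assigning to each $s \in S$ a probability distribution $\pi(\cdot \mid s)$ on $A$. Its value function would then be

$$
V^{\pi}(s) = \mathbb{E}^{\pi}\!\left[\, \sum_{t=0}^{\infty} \gamma^{t} R(s_t, a_t) \;\middle|\; s_0 = s \right],
$$

where $a_t \sim \pi(\cdot \mid s_t)$ and $s_{t+1} \sim P(\cdot \mid s_t, a_t)$. Other variants (random rewards, a finite horizon $T$, general state spaces) would change some of these lines, which is exactly the kind of ambiguity I would like a reference to settle.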