I am working on a problem involving Markov processes. We have an MDP with state set $S$, action set $A$, reward $r(s,a)$, and transition probabilities $p(s',s,a)$ giving the probability of moving to state $s'$ from state $s$ when the chosen action is $a$.
In my setting the transitions are deterministic: in any period $t$, given that I am in state $s_t$ and choose action $a_t\in A$, I know exactly which state $s_{t+1}$ I will transition to. That is, $p(s_{t+1},s_t,a_t)=1$ for exactly one state $s_{t+1}$ and $0$ for all others.
The only complication is that the reward $r(s_t,a_t)$ is uncertain, and moreover non-stationary.
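To make the setup concrete, here is a minimal sketch of the kind of process I mean (the states, actions, and the particular time-varying reward are made up purely for illustration):

```python
import random

# Deterministic transitions: each (state, action) pair maps to exactly
# one next state, i.e. p(s', s, a) = 1 for that s' and 0 otherwise.
transition = {
    ("s0", "a0"): "s1",
    ("s0", "a1"): "s0",
    ("s1", "a0"): "s0",
    ("s1", "a1"): "s1",
}

def reward(s, a, t):
    """Uncertain, non-stationary reward: each draw is noisy, and the
    mean drifts with the period t (illustrative numbers only)."""
    mean = 1.0 if a == "a0" else 0.5
    drift = 0.01 * t          # non-stationarity: mean changes over time
    return random.gauss(mean + drift, 0.1)

# Simulate a few periods under an arbitrary (random) policy.
s = "s0"
for t in range(5):
    a = random.choice(["a0", "a1"])
    r = reward(s, a, t)       # reward realization is uncertain
    s = transition[(s, a)]    # next state is known with certainty
```

So the dynamics are fully known and deterministic; only the reward process is stochastic and changes over time.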
Is this still an MDP? Since I am new to MDPs, could you refer me to some relevant work that makes similar assumptions on the transition probabilities?