This seems like a very basic question, but after recently coming across the question of what the requirements are for a Markov system to be optimal, I found myself confused. These are the ones that come to mind:
- It has to have a Markov strategy, i.e., $U_t = g_t(X_t)$, where $t$ denotes time
- The state space, action space, and noise space must all be Borel spaces
- The cost function (or the reward function) must be measurable.
Am I correct in identifying the requirements? Correct me if I am wrong - your input will be highly appreciated.
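
To make the setting concrete, here is a minimal sketch of what I mean by a Markov strategy. It assumes a finite-state, finite-action, finite-horizon problem with zero terminal cost, so the Borel and measurability conditions hold trivially and it says nothing about the general case I am asking about. The names `backward_induction`, `transition`, and `cost`, and the two-state example, are made up for illustration:

```python
def backward_induction(states, actions, horizon, transition, cost):
    """Compute decision rules g_t(x) by dynamic programming.

    transition(x, u) returns a dict {next_state: probability}.
    cost(x, u) returns the stage cost. Terminal cost is assumed to be zero.
    Returns (policy, V0), where policy[t][x] is the action chosen at time t
    in state x and V0[x] is the expected total cost from time 0.
    """
    V = {x: 0.0 for x in states}          # value function at the horizon
    policy = [None] * horizon
    for t in reversed(range(horizon)):
        g_t, new_V = {}, {}
        for x in states:
            best_u, best_q = None, float("inf")
            for u in actions:
                # stage cost plus expected cost-to-go
                q = cost(x, u) + sum(p * V[y] for y, p in transition(x, u).items())
                if q < best_q:
                    best_u, best_q = u, q
            g_t[x] = best_u               # depends only on (t, x)
            new_V[x] = best_q
        policy[t] = g_t
        V = new_V
    return policy, V


# Hypothetical toy problem: state 1 is costly, action 1 tries to switch state.
states, actions = [0, 1], [0, 1]

def transition(x, u):
    if u == 0:
        return {x: 1.0}                   # stay put
    return {1 - x: 0.9, x: 0.1}           # switch succeeds with probability 0.9

def cost(x, u):
    return x + 0.5 * u                    # holding cost plus switching cost

policy, V0 = backward_induction(states, actions, 3, transition, cost)
for t, g_t in enumerate(policy):
    print(t, g_t)
```

Each `policy[t]` is a map from the current state alone to an action, which is the form $U_t = g_t(X_t)$ in the first bullet. In this toy instance the rule at the last stage differs from the earlier ones, which is why the strategy is indexed by time.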