What is the difference between the discounted cost, total expected cost, and average expected cost MDP? Are they just MDP problems with different objective functions? When the discount factor equals 1, does the discounted cost MDP become the total cost MDP? Can anyone provide a more detailed explanation?
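To fix notation, here is how I understand the three criteria are usually written. The symbols are my own assumptions and not taken from a particular reference: $\pi$ is a policy, $x$ the initial state, $c(x_t, a_t)$ the one-stage cost, and $\beta \in (0,1)$ the discount factor.

$$
\begin{aligned}
\text{discounted cost:} \quad & V_\beta^\pi(x) = \mathbb{E}_x^\pi\!\left[\sum_{t=0}^{\infty} \beta^t\, c(x_t, a_t)\right], \\
\text{total expected cost:} \quad & V^\pi(x) = \mathbb{E}_x^\pi\!\left[\sum_{t=0}^{\infty} c(x_t, a_t)\right], \\
\text{average expected cost:} \quad & J^\pi(x) = \limsup_{N \to \infty} \frac{1}{N}\, \mathbb{E}_x^\pi\!\left[\sum_{t=0}^{N-1} c(x_t, a_t)\right].
\end{aligned}
$$

Formally, setting $\beta = 1$ in the first expression gives the second, but the infinite sum then need not converge, which is part of why I am unsure whether the two problems can be treated the same way.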
Another question: most of the existing theory provides conditions for the existence of an optimal stationary policy for MDPs with finite state and action spaces and for MDPs with a countable state space. How can one prove the existence of an optimal stationary policy for a discrete-time MDP with an uncountably infinite state space?