An Enquiry Concerning the "reward function" for a Markov Chain


The Statement of the Problem:

Consider a Markov chain with state space

$$ S = \{1, 2, 3 \} $$

and probability transition matrix

$$ P = \left( \begin{matrix} .3 & .7 & 0 \\ .4 & .4 & .2 \\ 0 & .7 & .3 \\ \end{matrix} \right) $$

Find $$ \lim_{n \to \infty} \frac{1}{n} \sum_{k = 1}^{n}X(k) $$ where $X(k)$ is the state at time $k$.

The Solution:

Let $g(j)$ be the "reward function" for $j \in S$. Then, since this MC is irreducible (and has a finite state space, hence is positive recurrent),

$$ \lim_{n \to \infty} \frac{1}{n} \sum_{k = 1}^{n}g \left[X(k)\right] = \sum_{j \in S}g(j)\pi_j $$

where $\pi_j$ is the long-run proportion of time that the process will be in state $j$.

Solving the balance equations $\pi P = \pi$, $\sum_j \pi_j = 1$ gives $\pi = \left(\frac{4}{13}, \frac{7}{13}, \frac{2}{13}\right)$. No problem. All I need is to determine $g(j)$ and sum the three $\pi_j$'s weighted by their rewards. Now, for some reason, the solution states (without qualification, so it must be "obvious") that
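As a sanity check on the stationary distribution, here is a quick sketch (assuming NumPy; the matrix is the one from the problem statement) that solves $\pi P = \pi$ with the normalization $\sum_j \pi_j = 1$ by replacing one balance equation with the normalization condition:

```python
import numpy as np

# Transition matrix from the problem statement.
P = np.array([[0.3, 0.7, 0.0],
              [0.4, 0.4, 0.2],
              [0.0, 0.7, 0.3]])

# pi is a left eigenvector of P with eigenvalue 1: pi (P - I) = 0.
# Drop one (redundant) balance equation and append sum(pi) = 1
# to get a square, nonsingular system.
A = np.vstack([(P.T - np.eye(3))[:-1], np.ones(3)])
b = np.array([0.0, 0.0, 1.0])
pi = np.linalg.solve(A, b)

print(pi)  # approximately [4/13, 7/13, 2/13]
```

This agrees with the $\pi$ quoted above.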

$$ g(j) = j . $$

Well, ok, that certainly makes the computation easy; but I can't, for the life of me, see why this is the case, i.e. that $g(j) = j$ (I suppose one could say that each state is a "fixed point" of $g$). Can someone please shine a light on this for me?
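For what it's worth, the solution's claimed answer $\sum_{j} j\,\pi_j = \frac{4}{13} + \frac{14}{13} + \frac{6}{13} = \frac{24}{13}$ can be sanity-checked by simulation. A minimal Monte Carlo sketch (assuming NumPy; the seed and path length are arbitrary choices of mine):

```python
import numpy as np

rng = np.random.default_rng(0)
P = np.array([[0.3, 0.7, 0.0],
              [0.4, 0.4, 0.2],
              [0.0, 0.7, 0.3]])

n = 200_000
state = 0            # 0-based index for state 1
total = 0
for _ in range(n):
    # Draw the next state from the row of P for the current state.
    state = rng.choice(3, p=P[state])
    total += state + 1   # X(k) is the 1-based state label

print(total / n)  # should be near 24/13 ≈ 1.846
```

The running average does settle near $24/13 \approx 1.846$, consistent with taking $g(j) = j$, though that of course doesn't explain *why* that choice is justified.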