I am not sure how to handle this problem where there are two types of cost: ( a ) cost of remaining in a state ( state_from = state_to ) ( b ) cost of transitioning from one state to another (state_from != state_to)
for example: I have 3 states, A,B,C
The transition matrix, T, for the chain is:
A B C
A 0.5 0.3 0.2
B 0.3 0.5 0.2
C 0.0 0.0 1.0
Here, probability of transitioning from A to C is 0.2
and the transition cost matrix, C, is as follows where cost of remaining in same state is 0 while transitioning to neighboring is 5 units and to the second neighbor is 10 units
A B C
A 0 5 10
B 5 0 5
C 10 5 0
If the chain starts from A i.e initial state vector is [1, 0, 0], and goes on for n steps, how to get the expected sum of the cost? Specifically,
- what is the cost involved in first transition
- what is the cost involved in nth transition
- what is the total cost involved after all n transitions
A key step toward solving this problem is to transfer the transition costs to the states. You can work with either the expected cost to leave a state or the expected one-step cost to enter a state. I’ll do the former. If $X_k=i$, then the expected cost of the next step is simply $\sum_{j=1}^3p_{ij}c_{ij}$. If you compute this for all three states, you get a vector $\mathbf r$ of expected exit costs. (I use $\mathbf r$ because the literature usually talks about “rewards” instead of costs.) Now, applying the law of total probability and linearity of expection, if $\mathbf\pi_k$ is the state distribution at time $k$, the expected exit cost for step $k+1$ is $\mathbf\pi_k\mathbf r = \mathbf\pi_0T^{k+1}\mathbf r$, and the total expected cost after $n$ steps is $$\sum_{k=0}^{n-1}\mathbf\pi_k\mathbf r = \left(\sum_{k=0}^{n-1}\mathbf\pi_k\right)\mathbf r = \mathbf\pi_0\left(\sum_{k=0}^{n-1}T^k\right)\mathbf r.$$ Unfortunately, $1$ is an eigenvalue of $T$, so you can’t simply compute the sum on the right as $(I-T)^{-1}(I-T^n)$.
For the system in your question, we have $\mathbf r=(3.5,2.5,0)^T$, so the expected cost of the first transition is $(1,0,0)\mathbf r = 3.5$. For the expected cost of the tenth transition I (well, really Mathematica) get approximately $0.4027$ and the total expected cost up through that time is about $14.014$. If you’re getting different values, check for off-by-one errors in the powers of $T$ that you’re using.