How to calculate cost in discrete Markov transitions

651 Views Asked by At

I am not sure how to handle this problem where there are two types of cost: ( a ) cost of remaining in a state ( state_from = state_to ) ( b ) cost of transitioning from one state to another (state_from != state_to)

for example: I have 3 states, A,B,C

The transition matrix, T, for the chain is:

      A     B     C
 A   0.5   0.3   0.2
 B   0.3   0.5   0.2
 C   0.0   0.0   1.0

Here, probability of transitioning from A to C is 0.2

and the transition cost matrix, C, is as follows where cost of remaining in same state is 0 while transitioning to neighboring is 5 units and to the second neighbor is 10 units

     A     B     C
 A   0     5     10
 B   5     0     5
 C   10    5     0

If the chain starts from A i.e initial state vector is [1, 0, 0], and goes on for n steps, how to get the expected sum of the cost? Specifically,

  • what is the cost involved in first transition
  • what is the cost involved in nth transition
  • what is the total cost involved after all n transitions
1

There are 1 best solutions below

6
On BEST ANSWER

A key step toward solving this problem is to transfer the transition costs to the states. You can work with either the expected cost to leave a state or the expected one-step cost to enter a state. I’ll do the former. If $X_k=i$, then the expected cost of the next step is simply $\sum_{j=1}^3p_{ij}c_{ij}$. If you compute this for all three states, you get a vector $\mathbf r$ of expected exit costs. (I use $\mathbf r$ because the literature usually talks about “rewards” instead of costs.) Now, applying the law of total probability and linearity of expection, if $\mathbf\pi_k$ is the state distribution at time $k$, the expected exit cost for step $k+1$ is $\mathbf\pi_k\mathbf r = \mathbf\pi_0T^{k+1}\mathbf r$, and the total expected cost after $n$ steps is $$\sum_{k=0}^{n-1}\mathbf\pi_k\mathbf r = \left(\sum_{k=0}^{n-1}\mathbf\pi_k\right)\mathbf r = \mathbf\pi_0\left(\sum_{k=0}^{n-1}T^k\right)\mathbf r.$$ Unfortunately, $1$ is an eigenvalue of $T$, so you can’t simply compute the sum on the right as $(I-T)^{-1}(I-T^n)$.

For the system in your question, we have $\mathbf r=(3.5,2.5,0)^T$, so the expected cost of the first transition is $(1,0,0)\mathbf r = 3.5$. For the expected cost of the tenth transition I (well, really Mathematica) get approximately $0.4027$ and the total expected cost up through that time is about $14.014$. If you’re getting different values, check for off-by-one errors in the powers of $T$ that you’re using.