A way to check the accuracy of a Markov chain?

I am not sure whether I should post this question on MSE or SSE. I will post it here first to see if I can get some feedback.

Say I have a finite discrete Markov chain, constructed perhaps from some data set. Is there a good way of checking its accuracy? Any model needs to be tested, or back-tested, in order to validate it, so I need to know whether there are such methods for Markov chains. Hope someone can help me out. Thanks.

On BEST ANSWER

If you have enough data, then you can collect transition pairs ($i$ to $j$) and do a chi-square test. If $N_i$ is the number of transition pairs starting at $i$, then each transition pair ($i$ to $j$) should occur about $N_ip_{ij}$ times. These are your expected cell values in the chi-square test. Note that you will need enough data so that $N_ip_{ij}>5\;\; \forall i,j$.

If $T$ is the number of transition probabilities and $S$ is the number of states, then your test will have $T-S$ degrees of freedom: the counts in each row must add up to $N_i$, so every state removes one free cell.
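For reference, the statistic that gets compared against the chi-square distribution can be written out as below. This is the standard Pearson form rather than anything specific to this problem. The symbol $N_{ij}$ (the observed number of $i$ to $j$ transitions) is notation I am introducing here, and I am assuming the sum runs only over pairs with $p_{ij}>0$.

$$\chi^2=\sum_{i}\;\sum_{j:\,p_{ij}>0}\frac{\left(N_{ij}-N_ip_{ij}\right)^2}{N_ip_{ij}}$$

A large value means the observed transition counts sit far from what the proposed matrix predicts.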

Example for OP

Let's say we have a simple $2 \times 2$ transition matrix $M$ for a two-state Markov chain.

At each step, you observe the actual state, $S_t\in\{1,2\}$. Let's say that we observe the system for $T$ time steps. What we want to know is whether $M$ is a plausible explanation for the dynamics of $S$.

To do this, we note that the transition matrix is really just a bunch of conditional probabilities:

$p_{ij}=P(S_{t+1}=j|S_t=i)$

Therefore, each row of $M$ sums to 1. This provides the basis for our chi-squared frequency table:

  1. First, count the number of times $S_t=1$ for $t<T$, call this $N_1$
  2. Second, count the number of times $S_t=2$ for $t<T$, call this $N_2$
  3. Note that $T=N_1+N_2 +1$, so we have accounted for all the observations except the last one, $S_T$
  4. Now, for the time steps at which the chain is in state $i$, count the number of times the next state was $j$, call this $N_{ij}$
  5. Form a $2 \times 2$ matrix $O$, where row 1 is the State 1 transitions, and row 2 is the State 2 transitions.
  6. Let $O_{ij}=N_{ij}$. These are the observed frequencies of each of the four possible transitions.
  7. For the chi-square test, we also need an expected frequency. Create another $2 \times 2$ matrix $E$, where row 1 is the State 1 transitions, and row 2 is the State 2 transitions.
  8. Let $E_{ij}=p_{ij}N_i$. You'll need enough data to ensure $E_{ij}>5\;\; \forall i,j$
  9. You now have the observed and expected cell counts you need for the chi-square test, which here has $4-2=2$ degrees of freedom. Just select your Type I error rate and you're set.
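The steps above can be sketched in a few lines of Python. This is only a minimal illustration under some assumptions of mine. States are labelled 0 and 1 instead of 1 and 2, to match Python indexing. The function names `transition_counts`, `chi_square_markov` and `p_value_df2` are made up for this example. Cells with $p_{ij}=0$ are simply skipped. The p-value helper only covers the 2-degrees-of-freedom case, where the chi-square tail is $e^{-x/2}$. The toy sequence is far too short to satisfy the $E_{ij}>5$ rule; it is only there to show the mechanics.

```python
import math


def transition_counts(states, n_states):
    """Observed matrix O, where O[i][j] counts the i -> j transitions."""
    O = [[0] * n_states for _ in range(n_states)]
    for cur, nxt in zip(states, states[1:]):
        O[cur][nxt] += 1
    return O


def chi_square_markov(states, M):
    """Return (statistic, degrees of freedom, O) for the sequence under M."""
    n = len(M)
    O = transition_counts(states, n)
    N = [sum(row) for row in O]  # N_i: visits to state i before the last step
    stat, cells = 0.0, 0
    for i in range(n):
        for j in range(n):
            if M[i][j] == 0:
                continue  # assumption: impossible transitions are left out
            E = N[i] * M[i][j]  # expected count E_ij = p_ij * N_i
            if E == 0:
                raise ValueError(f"state {i} never occurs before the final step")
            stat += (O[i][j] - E) ** 2 / E
            cells += 1
    return stat, cells - n, O  # df = T - S


def p_value_df2(stat):
    """Upper-tail probability of a chi-square variable with exactly 2 df."""
    return math.exp(-stat / 2.0)


# Toy data: too short for a real test, shown only to illustrate the calls.
states = [0, 0, 1, 1, 0, 0, 1, 1, 0]
M = [[0.75, 0.25], [0.25, 0.75]]
stat, df, O = chi_square_markov(states, M)
print(O, stat, df, p_value_df2(stat))
```

You would reject $M$ when the p-value falls below your chosen Type I error rate. For chains with more states, you would need a general chi-square tail function (for example from a statistics library) in place of `p_value_df2`.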