Find the expected frequency of some state in a state sequence of length N given a transition matrix M

191 Views Asked by At

I can represent stochastically-articulated sequences of states using a transition matrix M where a given entry in cell (i,j) corresponds to the probability of state j given that the current (or, most recently settled on) state is i.

For example, a coin has two possible states that are equiprobable no matter what the current state is. Its transition matrix has two rows and two columns, with every cell (i,j) containing the number .5. The expected frequency of Heads in a sequence defined by this transition matrix (where the first state is chosen randomly) is .5 * n where n is the length of the sequence.

When the rows of a transition matrix are all equal, it seems that for a given state you can derive the factor to be multiplied by n by just averaging the values of its corresponding column. Even with a unfair coin where heads has a probability of .6 (making the coin's 2x2 transition matrix's rows equal [.6 .4]), this method accurately expects a produced sequence to be 60% heads and 40% tails.

But suppose what produces the Heads and Tails isn't a coin, and that it has a transition matrix with first row [.5 .5] and second row [.4 .6]. My method described before expects 45% heads and 55% tails, but simply simulating the sequence production mechanism for a few states suggests that this isn't quite correct.

After the first state is decided randomly, if we don't presume that it has settled on any value, I at least believe we find that the probability of a Heads in the next state is (.5 * .5 + .5 * .4) = .45. But the probability of Heads on the next state without presuming anything about the sequence's past is (.5*.5*.5 + .5*.5*.4 + .5*.4*.5 + .5*.6*.4) = .445. And so I suspect that because of the bias for Tails when already at Tails, a smaller proportion of heads can be expected than just .45.

My problem is a bit more abstract than this particular one - I handle transition matrices for state spaces much greater than 2 (up to 50, actually). But my hope is that there is some formula or algorithm that will allow me to calculate the expected frequency of any state in a sequence of length n given a transition matrix that guides its articulation. Can you help me find it?

2

There are 2 best solutions below

3
On

The probability of state i after n jumps can only be defined depending on what your initial probability vector is. Suppose you let u be your initial vector. It is simply a probability distribution of time 0, where you choose randomly. If you need to find the probablity distribution after n steps, you simply multiply u by your transition matrix raised to the nth power. I.e.:

Suppose $$u= (.5,.5)$$ corresponding to the state space $$(H,T)$$ $$P(X=k)=u*M^n$$

Note that this outputs a one by two vector (because M is a 2 by 2 matrix), which is a probability distribution under the state space (H,T).

Basically, you need to know the intitial distribution.

0
On

For "short" sequences, in general you need the initial distribution to say anything.

For "long" sequences, the "generic" situation is that there is a unique invariant distribution, to which the long-term distribution of the process converges. (When I say "generic" I mean that I am omitting some assumptions which hold for "most" transition matrices.) This is a kind of "ergodic theorem" for Markov chains with a finite number of states. No particularly heavy machinery is needed for the proof, you just need the Perron-Frobenius theorem from linear algebra.

Looking at your example, the invariant distribution of the chain with matrix $\begin{bmatrix} 0.5 & 0.5 \\ 0.4 & 0.6 \end{bmatrix}$ is the solution to the eigenvector problem

$$\begin{bmatrix} p_1 & p_2 \end{bmatrix} \begin{bmatrix} 0.5 & 0.5 \\ 0.4 & 0.6 \end{bmatrix} = \begin{bmatrix} p_1 & p_2 \end{bmatrix}.$$

This can be recast as a linear system:

$$\begin{bmatrix} p_1 & p_2 \end{bmatrix} \begin{bmatrix} -0.5 & 0.5 \\ 0.4 & -0.4 \end{bmatrix} = 0.$$

This reads $0.5p_1=0.4p_2$, so $p_1=0.8p_2$. To be a distribution we need $p_1+p_2=1$. Combining these we get $p_1=\frac{0.8}{1.8}=0.4444\dots$ and $p_2=\frac{1}{1.8}=0.5555\dots$. This is the long term distribution of your process.

So your guess is incorrect. You can see this more easily in a more extreme example like

$$\begin{bmatrix} 0.5 & 0.5 \\ 0.01 & 0.99 \end{bmatrix}.$$

Once this chain gets to state 2, it will usually stay there for a long time, on average 100 steps. Then it will leave, and will on average only take 2 steps to get back to state 2, after which it will again stay for a long time, etc. So you would expect the fraction of $2$s in the sequence to be something like 98%, much larger than the approximately 75% that your guess gives.