Confusion regarding $m$-step transition function

36 Views Asked by At

The following is from my class lecture notes (handwritten) on Markov chain. I am having trouble understanding some parts of it. Trying to summarize what is written:

The $m$-step transition function: $P^{m}(x,y)$:

The probability of going from $x$ to $y$ in $m$-steps is:

$$P^{m}(x,y) = \sum_{y_1}...\sum_{y_{m-1}}P(x,y_1)P(y_1,y_2)...P(y_{m-2},y_{m-1})P(y_{m-1},y)$$

$P^{1}(x,y) = P(x,y)$

$P^{0}(x,y) = \begin{cases} 1, & x = y \\ 0, & x\neq y \end{cases}$

is $A_0,A_1,...,A_{n-1} \subseteq F$ ($F$ being the state space)

then

$$P(X_{n+1} = y_1,...,X_{n+m}=y_m|X_0\in A_0,X_1\in A_1,...,X_{n-1}\in A_{n-1}, X_n=x) = P(x,y_1)P(y_1,y_2)...P(y_{m-1},y_m)$$

$$\therefore P(X_{n+m}=y|X_0\in A_0,X_1 \in A_1, ..., X_{n-1}\in A_{n-1}, X_n=x)=P^{m}(x,y)$$

$$P(X_{n+m}=y|X_0=x,X_n=z)=P^{m}(z,y)$$

$$\therefore P^{n+m}(x,y) = P(X_{n+m}=y|X_0=x)$$

$$ = \sum_{z}P(X_n=z|X_0=x)P(X_{n+m} = y | X_0 = x, X_n=z)$$

Questions:

  1. Why are we summing over $y_1,y_2,...,y_m$ in the expression for $P^{m}(x,y)$? Aren't $x,y,y_1,..,y_{m-1}$ just "states"? What does summing over $y_1,y_2,...$ even mean in this context?

    According to me the probability of going from a state $x$ to $y$ in $m$ steps should have just been $P(x,y_1)P(y_1,y_2)...P(y_{m-2},y_{m-1})P(y_{m-1},y)$ if $y_1,y_2,..$ were the intermediate steps.

  2. Why is $P(X_{n+m}=y|X_0\in A_0,X_1 \in A_1, ..., X_{n-1}\in A_{n-1}, X_n=x)=P^{m}(x,y)$? What does it mean? I'm not even sure what $A_1,A_2,..$ stand for here.

  3. Why is $P(X_{n+m}=y|X_0=x,X_n=z)=P^{m}(z,y)$ ?

1

There are 1 best solutions below

5
On BEST ANSWER
  1. The $m-1$ intermediate steps could be anything, therefore you have to sum over all possibilities, which is equivalent to summing over all $m-1$ tuples of states.
  2. The probability to get from a given state to another in $m$ steps is independent of what the previous steps were. That is the essential property that makes this a Markov chain, and it comes from the very definition of the process. The $A_i$ are some subsets. It doesn't really matter, it could as well be written "some information regarding $X_i$, $i<n$" instead of "$X_i\in A_i$".
  3. Same principle as in 2., the information regarding $X_0$ is irrelevant, and the probability to get from a given state to another in $m$ steps does not depend on the index (value of $n$ in $X_n$) of the initial step considered.