Background:
Let $X=(X_0,X_1,X_2,\ldots)$ be a sequence of random variables taking values in $I$ (the state space). The process $X$ is called a Markov chain if for any $n \geq 0$ and any $i_0,i_1,\ldots,i_{n+1} \in I$, $$\mathbb{P}(X_{n+1} = i_{n+1} \mid X_n = i_n, \ldots, X_0 = i_0) = \mathbb{P}(X_{n+1} = i_{n+1} \mid X_n = i_n)$$ The Markov chain is assumed to be homogeneous, i.e. the transition probabilities $p_{ij} := \mathbb{P}(X_{n+1} = j \mid X_n = i)$ depend only on $i$ and $j$, not on $n$.
The initial distribution $\lambda$ of $X_0$ is given by $\lambda_i = \mathbb{P}(X_0 = i)$ for each $i \in I$.
Problem:
I am trying to prove the Markov property, which is stated as: \begin{align} &\mathbb{P}(X_{n+1} \in A_{n+1}, \ldots, X_{n+m} \in A_{n+m} \mid X_0 \in A_0,\ldots,X_{n-1} \in A_{n-1},X_n = i)\\ &= \mathbb{P}(X_{n+1} \in A_{n+1}, \ldots, X_{n+m} \in A_{n+m} \mid X_n = i)\\ &= \mathbb{P}(X_1 \in A_{n+1}, \ldots, X_m \in A_{n+m} \mid X_0 = i) \end{align} for all $A_0,\ldots,A_{m+n} \subseteq I$ with $\mathbb{P}(X_0 \in A_0,\ldots,X_{n-1} \in A_{n-1},X_n = i) > 0$.
We're told that to prove this, we should use the following:
The case when $A_k = \{i_k\}$ for all $k$ follows from a previous theorem, which states that $$\mathbb{P}(X_0 = i_0, X_1 = i_1,\ldots,X_n = i_n) = \lambda_{i_0}p_{i_0 i_1}p_{i_1 i_2} \ldots p_{i_{n-1} i_{n}}$$
For general $A_k$, $k \geq n+1$, sum over $i_k \in A_k$, $k \geq n+1$. The case for general $A_k$, $k \leq n$ can be obtained intuitively from the following observation: If $\mathbb{P}(E\mid F_1)=\mathbb{P}(E\mid F_2)=\mathbb{P}(E\mid G)$ for disjoint $F_1,F_2 \subseteq G$, then $\mathbb{P}(E\mid F_1 \cup F_2) = \mathbb{P}(E\mid G)$.
I've managed to show case 1, but I'm stuck on case 2.
Progress:
Case 1.
If $A_k = \{i_k\}$ for all $k$, then we have \begin{align} &\mathbb{P}(X_{n+1} = i_{n+1}, \ldots, X_{n+m} = i_{n+m} \mid X_0 = i_0,\ldots,X_{n-1} = i_{n-1},X_n = i)\\ &= \frac{\mathbb{P}(X_0 = i_0, \ldots, X_{n+m} = i_{n+m})}{\mathbb{P}(X_0 = i_0, \ldots, X_n = i)}\\ &= \frac{\lambda_{i_0}p_{i_0 i_1}p_{i_1 i_2} \ldots p_{i_{n+m-1} i_{n+m}}}{\lambda_{i_0}p_{i_0 i_1}p_{i_1 i_2} \ldots p_{i_{n-1} i}}\\ &= p_{i i_{n+1}} \ldots p_{i_{n+m-1} i_{n+m}}\\ &= \frac{\mathbb{P}(X_n = i)p_{i i_{n+1}} \ldots p_{i_{n+m-1} i_{n+m}}}{\mathbb{P}(X_n = i)}\\ &= \mathbb{P}(X_{n+1} = i_{n+1}, \ldots, X_{n+m} = i_{n+m} \mid X_n = i) \end{align} and also, by homogeneity, \begin{align} & p_{i i_{n+1}} \ldots p_{i_{n+m-1} i_{n+m}}\\ &= \frac{\lambda_i p_{i i_{n+1}} \ldots p_{i_{n+m-1} i_{n+m}}}{\lambda_i}\\ &= \mathbb{P}(X_1 = i_{n+1}, \ldots, X_m = i_{n+m} \mid X_0 = i) \end{align}
Case 2. \begin{align} &\mathbb{P}(X_{n+1} \in A_{n+1}, \ldots, X_{n+m} \in A_{n+m} \mid X_0 \in A_0,\ldots,X_{n-1} \in A_{n-1},X_n = i)\\ &= {\small \mathbb{P}\left(\left(\bigcup_{i_{n+1} \in A_{n+1}} \{X_{n+1} = i_{n+1}\}\right) \cap \ldots \cap \left(\bigcup_{i_{n+m} \in A_{n+m}} \{X_{n+m} = i_{n+m}\}\right) \mid X_0 \in A_0,\ldots,X_{n-1} \in A_{n-1},X_n = i\right)}\\ &= {\small \mathbb{P}\left(\bigcup_{i_{n+1} \in A_{n+1}} \ldots \bigcup_{i_{n+m} \in A_{n+m}} (\{X_{n+1} = i_{n+1}\} \cap \ldots \cap \{X_{n+m} = i_{n+m}\}) \mid X_0 \in A_0,\ldots,X_{n-1} \in A_{n-1},X_n = i\right)}\\ &= {\small \sum_{i_{n+1} \in A_{n+1}} \ldots \sum_{i_{n+m} \in A_{n+m}} \mathbb{P}(X_{n+1} = i_{n+1}, \ldots, X_{n+m} = i_{n+m} \mid X_0 \in A_0,\ldots,X_{n-1} \in A_{n-1},X_n = i)} \end{align} I'm not sure how to use the hint to make progress from here.
You should apply the hint for Case 2 in two steps:
Step A. First prove the result when $A_{n+1}, \ldots, A_{n+m}$ are arbitrary sets and $A_0=\{i_0\}, A_1=\{i_1\}, \ldots, A_{n-1}=\{i_{n-1}\}$ are singletons. The work that you've done so far for Case 2 applies to this situation, and each term of your final sum can now be evaluated with Case 1.
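As a sketch of how Step A can be written out (assuming, as the sums in your Case 2 already do, that $I$ is countable; the label $F:=\{X_0=i_0,\ldots,X_{n-1}=i_{n-1},X_n=i\}$ with $\mathbb{P}(F)>0$ is introduced here only for brevity): \begin{align} &\mathbb{P}(X_{n+1} \in A_{n+1}, \ldots, X_{n+m} \in A_{n+m} \mid F)\\ &= \sum_{i_{n+1} \in A_{n+1}} \ldots \sum_{i_{n+m} \in A_{n+m}} \mathbb{P}(X_{n+1} = i_{n+1}, \ldots, X_{n+m} = i_{n+m} \mid F)\\ &= \sum_{i_{n+1} \in A_{n+1}} \ldots \sum_{i_{n+m} \in A_{n+m}} p_{i i_{n+1}} \ldots p_{i_{n+m-1} i_{n+m}}\\ &= \sum_{i_{n+1} \in A_{n+1}} \ldots \sum_{i_{n+m} \in A_{n+m}} \mathbb{P}(X_{n+1} = i_{n+1}, \ldots, X_{n+m} = i_{n+m} \mid X_n = i)\\ &= \mathbb{P}(X_{n+1} \in A_{n+1}, \ldots, X_{n+m} \in A_{n+m} \mid X_n = i) \end{align} The first and last equalities are the disjoint-union step from your Case 2 work, and the two middle equalities are Case 1. By the second computation in Case 1, the same sum of products also equals $\mathbb{P}(X_1 \in A_{n+1}, \ldots, X_m \in A_{n+m} \mid X_0 = i)$, provided $\lambda_i > 0$.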
Step B. Now let $A_0, A_1,\ldots, A_{n-1}$ be arbitrary sets and apply the "observation" with $E:=\{X_{n+1}\in A_{n+1}, X_{n+2}\in A_{n+2},\ldots,X_{n+m}\in A_{n+m}\}$, $G:=\{X_n=i\}$, and a finite or countable collection of disjoint events (not just two events) of the form $F:=\{X_0=i_0, X_1=i_1, \ldots, X_{n-1}=i_{n-1}, X_n=i\}$ with $i_k \in A_k$ for each $k$. Step A tells us that $\mathbb{P}(E\mid F)$ is the same value, namely $\mathbb{P}(E\mid G)$, for every $F$ in your collection with $\mathbb{P}(F)>0$.
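One way to carry out Step B, sketched under the assumption that $I$ is countable (so that the collection of events is at most countable), with $H$, $c$ and the indexed $F_{i_0 \ldots i_{n-1}}$ as shorthand introduced here: write $H:=\{X_0 \in A_0,\ldots,X_{n-1} \in A_{n-1},X_n = i\}$ and $c:=\mathbb{P}(E \mid X_n = i)$. The events $F_{i_0 \ldots i_{n-1}}:=\{X_0=i_0,\ldots,X_{n-1}=i_{n-1},X_n=i\}$ with $i_k \in A_k$ are pairwise disjoint with union $H$, and Step A gives $\mathbb{P}(E \cap F_{i_0 \ldots i_{n-1}}) = c\,\mathbb{P}(F_{i_0 \ldots i_{n-1}})$ for each of them (trivially so when the event has probability zero). Hence \begin{align} \mathbb{P}(E \cap H) &= \sum_{i_0 \in A_0} \ldots \sum_{i_{n-1} \in A_{n-1}} \mathbb{P}(E \cap F_{i_0 \ldots i_{n-1}})\\ &= c \sum_{i_0 \in A_0} \ldots \sum_{i_{n-1} \in A_{n-1}} \mathbb{P}(F_{i_0 \ldots i_{n-1}})\\ &= c\,\mathbb{P}(H) \end{align} and dividing by $\mathbb{P}(H) > 0$ gives $\mathbb{P}(E \mid H) = c = \mathbb{P}(E \mid X_n = i)$. With only two events $F_1, F_2$ this is exactly the computation behind the observation in the hint.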