Let $(X,\mathcal{A})$ and $(Y,\mathcal{B})$ be measurable spaces. In the definition of Markov kernel $K:X\times\mathcal{B}\rightarrow[0,1]$, it is required that
- $K(x,\cdot)$ is a probability measure on $(Y,\mathcal{B})$ for all $x\in X$.
- $K(\cdot,B)$ is $\mathcal{A}$-measurable for all $B\in\mathcal{B}$.
Intuitively, (1) makes sense: given some current state $x\in X$, $K(x,\cdot)$ gives us a way to talk about the probability of the next state.
However, I'm not sure why (2) is needed, both intuitively and formally. How do I go about thinking about (2)?
Assume that instead of knowing the precise state $x$, we only know the probability distribution $\mu$ of the state, which is a measure on $(X, \mathcal{A})$.
Then we want to ask: what is the probability that the next state lies in some event $B \in \mathcal{B}$? If $K$ is the transition kernel of the Markov chain, the answer is $$\int K(x, B) \, d\mu(x).$$
Condition (2) is exactly what guarantees that the integrand $x \mapsto K(x, B)$ is $\mathcal{A}$-measurable, so this integral is well defined for every event $B$ and every distribution $\mu$.
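To see the formula in action, here is a minimal sketch on a finite state space, where the integral reduces to a finite sum and measurability holds automatically. The kernel `K`, distribution `mu`, and event `B` below are made-up illustrative values, not from the question.

```python
# Finite-state sketch of P(next state in B) = ∫ K(x, B) dμ(x).
# With X = Y = {0, 1}, the kernel is just a row-stochastic matrix:
# K[x][y] = probability of moving from state x to state y.
K = [
    [0.9, 0.1],  # transitions from state 0
    [0.4, 0.6],  # transitions from state 1
]

# Current distribution mu over X (a probability vector).
mu = [0.5, 0.5]

# Event B ⊆ Y; here B = {1}.
B = {1}

def K_of(x, event):
    # K(x, B) = sum of K[x][y] over y in B.
    # For each fixed event, this is a function of x alone -- on a finite
    # space it is trivially measurable, which is what condition (2)
    # demands in general.
    return sum(K[x][y] for y in event)

# The integral ∫ K(x, B) dμ(x) becomes the finite sum Σ_x μ(x) K(x, B).
prob_next_in_B = sum(mu[x] * K_of(x, B) for x in range(len(mu)))
print(prob_next_in_B)  # 0.5 * 0.1 + 0.5 * 0.6 = 0.35
```

On an uncountable state space the sum becomes a genuine Lebesgue integral, and measurability of $x \mapsto K(x,B)$ is no longer automatic, which is why (2) must be assumed.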