Question about the definition of Markov kernel


Let $(X,\mathcal A)$ and $(Y,\mathcal B)$ be measurable spaces. A "Markov kernel" with source $(X,\mathcal A)$ and target $(Y,\mathcal B)$ is a map $\kappa : \mathcal B \times X \to [0,1]$ with the following properties:
(1) For every (fixed) $B \in \mathcal B$, the map $x \mapsto \kappa(B, x)$ is $\mathcal A$-measurable.
(2) For every (fixed) $x \in X$, the map $B \mapsto \kappa(B, x)$ is a probability measure on $(Y, \mathcal B)$.
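To make the two conditions concrete, here is a minimal sketch of a hypothetical finite example: when $X = Y = \{0, 1\}$ with the discrete $\sigma$-algebras, a Markov kernel is nothing but a row-stochastic matrix $K$, where $K[x, y]$ plays the role of $\kappa(\{y\}, x)$.

```python
import numpy as np

# Hypothetical finite example: X = Y = {0, 1}, so a kernel is just a
# row-stochastic matrix K, with K[x, y] playing the role of kappa({y}, x).
K = np.array([
    [0.7, 0.3],   # kappa(., 0): a probability measure on {0, 1}
    [0.2, 0.8],   # kappa(., 1): another probability measure on {0, 1}
])

# Condition (2): each row is a probability measure
# (entries non-negative, each row sums to 1).
assert np.all(K >= 0) and np.allclose(K.sum(axis=1), 1.0)

# Condition (1) is automatic here: on a finite X with the discrete
# sigma-algebra, every map x -> kappa(B, x) is measurable.
def kappa(B, x):
    """kappa(B, x) for a subset B of {0, 1}: sum the entries of row x."""
    return sum(K[x, y] for y in B)

print(kappa({1}, 0))       # 0.3
print(kappa({0, 1}, 1))    # 1.0, since kappa(., 1) is a probability measure
```

On a general (uncountable) $X$, condition (1) is no longer automatic, which is why the definition has to demand it.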

Can someone please explain what this definition is saying? I am not getting the point of it. If someone could explain it with an example, that would be a great help. Thanks.




Markov kernels are just a way of expressing conditional distributions. The idea is that for each $x \in X$, you want to say that the conditional law of some random variable $\mathbf Y$ given an observation of $\mathbf X = x$ is $\kappa(\cdot,x)$ --- this is sometimes denoted $\kappa(\cdot|x)$ instead. This is precisely the point of condition (2): $\kappa(B,x)$ is meant to represent $P(\mathbf Y \in B \mid \mathbf X = x)$.
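This reading of $\kappa(B,x)$ as $P(\mathbf Y \in B \mid \mathbf X = x)$ can be checked by simulation. The sketch below uses a hypothetical two-state kernel (the matrix $K$ is made up for illustration): draw $\mathbf X$, then draw $\mathbf Y$ from the row of $K$ selected by $\mathbf X$, and compare the empirical conditional frequency to the kernel entry.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical two-state setup: X in {0, 1} is observed, and Y | X = x
# is distributed according to row x of K, i.e. kappa(. | x).
K = np.array([[0.7, 0.3],
              [0.2, 0.8]])

x = rng.integers(0, 2, size=100_000)          # draw X uniformly on {0, 1}
u = rng.random(x.size)
y = (u < K[x, 1]).astype(int)                 # Y = 1 with probability K[x, 1]

# Empirical P(Y = 1 | X = 0) should approximate kappa({1}, 0) = 0.3.
est = (y[x == 0] == 1).mean()
print(round(est, 2))
```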

In order to work with objects like $\kappa(B,x)$ as $x$ varies, the technical structure of probability theory requires that these conditional probabilities are not wild, in the sense that at least measure-theoretic issues don't arise. We would need this, for instance, to ensure that sets like $E = \{x: \kappa(B,x) \ge \tau\}$ are actually events. Note that such events are something we would often like to be able to deal with --- for example, if you observe $\mathbf Y \in B$, and you want to decide which $x$ are plausible for $\mathbf X$ given this observation, one reasonable answer is precisely $E$ for some value of $\tau$. The condition (1) says (more or less) that no matter what $B \in \mathcal{B}, \tau$ we choose, the sets $E$ belong to $\mathcal{A},$ which is precisely what is needed for things like $P(\mathbf X \in E)$ to make sense.

To sum up, Markov kernels are a formal way to set up conditional distributions. (2) is precisely the part of the definition that captures this aspect, while (1) is needed for technical reasons to ensure that the conditional distributions we treat are nice to work with.


@stochasticboy321's answer is a very good general answer, and I just want to add one other perspective from Markov chains. In this case, let's just simplify by assuming $X=Y$.

A Markov transition kernel $\kappa$ induces two operators:

  1. the map $\mu \mapsto \mu \,\kappa$ on the space of probability measures defined by $$\mu \, \kappa(A) = \int_{X} \kappa(A,x) \ \mu(dx),$$
  2. the map $f \mapsto \kappa f$ on the space of bounded measurable functions defined by $$\kappa f(x) := \int_{X} f(y) \ \kappa(dy,x).$$

Both of these operators are fundamental to the analysis of Markov chains. Briefly:

  1. Given a distribution $\mu$, we can view $\mu$ as the "starting" distribution of the process, and $\mu \, \kappa$ is the subsequent distribution of running the process one step.
  2. Given a bounded measurable function $f$, $\kappa f(x)$ computes the expected value of $f$ conditioned on a starting state $x$ and running the process one step.
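On a finite state space both operators reduce to matrix multiplication, which makes them easy to see in action. In this sketch (the chain on $\{0,1,2\}$ is made up for illustration), $\mu \, \kappa$ is a row vector times the kernel matrix, and $\kappa f$ is the kernel matrix times a column vector:

```python
import numpy as np

# Hypothetical finite chain on X = {0, 1, 2}; the kernel is a
# row-stochastic matrix K with K[x, y] = kappa({y}, x).
K = np.array([[0.5, 0.5, 0.0],
              [0.1, 0.8, 0.1],
              [0.0, 0.5, 0.5]])

# 1. mu -> mu K: push a starting distribution forward one step.
mu = np.array([1.0, 0.0, 0.0])   # start deterministically at state 0
mu_next = mu @ K                 # distribution after one step: [0.5, 0.5, 0.0]

# 2. f -> K f: expected value of f after one step from each starting state,
#    i.e. (K f)(x) = E[ f(X_1) | X_0 = x ].
f = np.array([0.0, 1.0, 2.0])
Kf = K @ f                       # [0.5, 1.0, 1.5]

# The two operators are dual: integrating f against mu K equals
# integrating K f against mu.
assert np.isclose(mu_next @ f, mu @ Kf)

print(mu_next)
print(Kf)
```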

The assumptions we put on a Markov transition kernel are the minimum assumptions that make these operators well-defined.