I was reading about the empirical distribution. Wikipedia says the empirical distribution is an example of an empirical measure, and the page on empirical measure says that the empirical measure is a random probability measure arising from a particular realization of a sequence of random variables.
Up to now this makes sense to me, because by inspecting the definition of the empirical distribution, it is:
$$\text{For real-valued i.i.d. random variables } X_{1}, \ldots, X_{n}, \text{ the empirical distribution is given by } F_{n}(x)=P_{n}((-\infty, x])=\frac{1}{n}\sum_{i=1}^{n} \mathbf{1}_{(-\infty, x]}(X_{i}).$$
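To make this concrete, here is a minimal numerical sketch of $F_n$ (the helper name `ecdf` and the sample values are my own, chosen only for illustration):

```python
import numpy as np

def ecdf(samples, x):
    """Empirical distribution function F_n(x) = (1/n) * #{i : X_i <= x}."""
    samples = np.asarray(samples)
    return np.mean(samples <= x)

# Five observed realizations of X_1, ..., X_5
xs = [0.2, 1.5, -0.3, 0.7, 2.1]
print(ecdf(xs, 0.7))  # 3 of the 5 samples are <= 0.7, so F_5(0.7) = 0.6
```

For a fixed $x$, redrawing the sample changes the value of `ecdf(xs, x)`, which is exactly the sense in which $F_n(x)$ is random.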
However, when I go deeper into the definition of a random probability measure, the page says that a random probability measure is a measure-valued random element (and can also be defined as a transition kernel), which starts to confuse me.
In its abstract form, a random probability measure is defined as:

A random measure $\zeta$ is an (a.s.) locally finite transition kernel from an (abstract) probability space $(\Omega, \mathcal{A}, P)$ to $(E, \mathcal{E})$. Being a transition kernel means that:
- For any fixed $B \in \mathcal{E}$, the mapping $ \omega \mapsto \zeta(\omega, B) $ is measurable from $(\Omega, \mathcal{A})$ to $(\mathbb{R}, \mathcal{B}(\mathbb{R}))$
- For every fixed $\omega \in \Omega$, the mapping $B \mapsto \zeta(\omega, B) \quad(B \in \mathcal{E})$ is a measure on $(E, \mathcal{E})$
However, from reading this material, it seems to me that one example of such a kernel is a conditional probability?
To put it in the context of the example, the kernel can be defined as follows when $\mathcal{X}$ is discrete:
$P_{x y}=P\left(X_{n}=y \mid X_{n-1}=x\right) \quad x, y \in \mathcal{X}$
To match this example with the definition:

- For any fixed $y\in\mathcal{X}$, $P(X_n=y\mid X_{n-1}=x)$ is a function of the conditioning state $x$, and is therefore a random variable once $X_{n-1}$ is random. More generally, $y$ may be replaced by a subset $A$ of $\mathcal{X}$.
- For any fixed $X_{n-1}=x$, the mapping $y \mapsto P(X_n=y\mid X_{n-1}=x)$ is a probability measure on $\mathcal{X}$, and we have known how to calculate conditional probabilities for a very long time.
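The two kernel properties can be seen directly on a transition matrix (a minimal sketch; the 3-state chain and the matrix entries below are made up for illustration):

```python
import numpy as np

# Hypothetical 3-state chain on X = {0, 1, 2}; P[x, y] = P(X_n = y | X_{n-1} = x).
P = np.array([
    [0.5, 0.3, 0.2],
    [0.1, 0.6, 0.3],
    [0.4, 0.4, 0.2],
])

# Property 2: for each fixed x (a row), y -> P[x, y] is a probability measure.
print(P.sum(axis=1))  # every row sums to 1

# Property 1: for each fixed y (a column), x -> P[x, y] is a function of the
# conditioning state x, hence a random variable once X_{n-1} is random.
print(P[:, 1])  # [0.3, 0.6, 0.4]
```

Rows are the "measure" direction (fixed $\omega$, varying $B$); columns are the "measurable function" direction (fixed $B$, varying $\omega$).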
However, I am unable to draw an analogy between the above correspondence and the empirical measure. Does that mean that if we knew the underlying law governing the behavior of $X$, we would be able to obtain the probability/density of observing the relative frequencies we observed? And since we don't know the exact law governing the behavior of $X$, is the observed frequency table still random, depending on the underlying law?
Note that $$ \mu_n(\omega,B):=P_n(B)(\omega)=\frac{1}{n}\sum_{i=1}^n \mathbf{1}\{X_i(\omega)\in B\}. $$ It is clearly random because it is a measurable function of the random variables $X_1,\ldots,X_n$. One may check that $\mu_n$ is a Markov kernel from $(\Omega,\mathcal{A})$ to $(\mathbb{R},\mathcal{B}(\mathbb{R}))$, i.e., for any fixed $B\in\mathcal{B}(\mathbb{R})$, $\omega\mapsto \mu_n(\omega,B)$ is measurable, and for each fixed $\omega\in\Omega$, $B\mapsto \mu_n(\omega,B)$ is a (discrete) probability measure.
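Both kernel properties of $\mu_n$ can be checked numerically (a minimal sketch; the helper `empirical_measure`, the choice of $N(0,1)$, and the sample size are my own assumptions, and sets $B$ are restricted to intervals $(a,b]$ for simplicity):

```python
import numpy as np

rng = np.random.default_rng(0)

def empirical_measure(sample, B):
    """mu_n(omega, B) = (1/n) * #{i : X_i(omega) in B}, for B an interval (a, b]."""
    a, b = B
    sample = np.asarray(sample)
    return np.mean((sample > a) & (sample <= b))

# One realization omega: n = 1000 draws of X_i ~ N(0, 1)
sample = rng.standard_normal(1000)

# Fixed B, varying omega: mu_n(., B) is a random variable; redrawing the
# sample gives a different value, concentrating near P(-1 < X <= 1) ~ 0.6827.
print(empirical_measure(sample, (-1.0, 1.0)))

# Fixed omega, varying B: B -> mu_n(omega, B) is a probability measure, so
# disjoint sets add up and the whole line carries total mass 1 (up to rounding).
m = empirical_measure(sample, (-np.inf, 0.0)) + empirical_measure(sample, (0.0, np.inf))
print(m)
```

Fixing the sample and varying $B$ gives the "measure" direction; fixing $B$ and redrawing the sample gives the "random variable" direction, which is exactly the Markov-kernel structure above.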