Let
- $(\Omega,\mathcal A,\operatorname P)$ be a probability space
- $(E,\mathcal E)$ be a measurable space
- $\mu$ be a probability measure on $(E,\mathcal E)$
- $X$ be an $(E,\mathcal E)$-valued random variable on $(\Omega,\mathcal A,\operatorname P)$
- $\kappa$ be a Markov kernel on $(E,\mathcal E)$
- $p\in[0,1]$
Assume we construct an $(E,\mathcal E)$-valued random variable $Y$ on $(\Omega,\mathcal A,\operatorname P)$ in the following way: With probability $p$ we draw $Y$ from $\mu$ and with probability $1-p$ we draw $Y$ from $\kappa(X,\;\cdot\;)$.
What's the conditional distribution $\operatorname P\left[Y\in\;\cdot\;\mid X\right]$ of $Y$ given $X$? In particular, I want to determine the Markov kernel $Q$ on $(E,\mathcal E)$ such that $$\operatorname P\left[Y\in B\mid X\right]=Q(X,B)\;\;\;\text{almost surely for all }B\in\mathcal E.\tag1$$
In order to give a rigorous answer, I think that we need to introduce a $\{0,1\}$-valued $p$-Bernoulli distributed random variable $Z$ on $(\Omega,\mathcal A,\operatorname P)$ such that
- $X$ and $Z$ are independent
- $X$ and $Y$ are independent given $\{Z=1\}$
- $\operatorname P\left[Y\in B\mid Z=1\right]=\mu(B)$ for all $B\in\mathcal E$
- $\operatorname P\left[Y\in B\mid X\right]=\kappa(X,B)$ almost surely on $\{Z=0\}$ for all $B\in\mathcal E$
At first glance, I thought this would be an easy task. However, I don't know how to proceed. First of all, is my (supposedly equivalent) description of the problem with the random variable $Z$ correct, or did I impose a false assumption?
If the description is correct, how do we need to proceed?
Please take note of this related question: If we sample with a fixed probability from a distribution, what does this rigorously mean, theoretically?
Some notation. When $\nu$ is a probability measure on a space $E$ and $\kappa$ is a Markov kernel on the same space, the semidirect product $\nu\rtimes \kappa$ is the measure on $E\times E$ (equipped with the product $\sigma$-algebra) satisfying $$ (\nu\rtimes \kappa)(A\times B)=\nu(1_A\cdot \kappa 1_B). $$ It is the law of the first two steps of a Markov chain with initial distribution $\nu$ and transition kernel $\kappa$.
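For concreteness, the defining identity $(\nu\rtimes\kappa)(A\times B)=\int_A \kappa(x,B)\,\nu(dx)$ can be checked numerically on a small finite space. The measure and kernel below are hypothetical illustrative choices, not anything fixed by the question:

```python
import numpy as np

# Hypothetical finite state space E = {0, 1}.
nu = np.array([0.3, 0.7])               # probability measure nu on E
kappa = np.array([[0.9, 0.1],           # kappa[x, y] = kappa(x, {y})
                  [0.2, 0.8]])

# Joint law of the first two steps of the chain, as a matrix on E x E:
# joint[x, y] = nu({x}) * kappa(x, {y}), i.e. the semidirect product.
joint = nu[:, None] * kappa

# (nu ⋊ kappa)(A x B) for the rectangle A = {0}, B = {1}:
A, B = [0], [1]
val = joint[np.ix_(A, B)].sum()         # equals nu({0}) * kappa(0, {1}) = 0.03
```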
Formalizing the question. Let Ber$_p$ denote the probability measure on $\{0,1\}$ satisfying Ber$_p(\{1\})=p$. Consider the enlarged sample space $\Gamma=E^3\times \{0,1\}$ with the product $\sigma$-algebra, and equip $\Gamma$ with the probability measure $\mathbb P=\mu\otimes(\nu\rtimes \kappa)\otimes \textrm{Ber}_p$, where $\nu$ denotes the law of $X$.
Consider the function $f\colon \Gamma\to E$ given by $$ f(w,x,y,z)=\begin{cases}y,& z = 0\\ w,& z = 1\end{cases}. $$ When $f$ is regarded as a random element of $E$, it is precisely the result of "sampling from $\mu$ with probability $p$ and from $\kappa(X,\cdot)$ with probability $1-p$" in the way you have described.
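The enlarged sample space and the map $f$ can be sketched in code on a finite space. All distributions below ($\mu$, $\nu$, $\kappa$, $p$) are hypothetical placeholders chosen only to make the sketch runnable:

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical ingredients on the finite space E = {0, 1}.
mu = np.array([0.5, 0.5])      # the mixing measure mu
nu = np.array([0.3, 0.7])      # the law nu of X
kappa = np.array([[0.9, 0.1],  # kappa[x, y] = kappa(x, {y})
                  [0.2, 0.8]])
p = 0.25

def draw_gamma():
    """Draw one point (w, x, y, z) of Gamma = E^3 x {0, 1} under
    the product measure P = mu ⊗ (nu ⋊ kappa) ⊗ Ber_p."""
    w = rng.choice(2, p=mu)          # w ~ mu
    x = rng.choice(2, p=nu)          # x ~ nu
    y = rng.choice(2, p=kappa[x])    # given x, y ~ kappa(x, .)
    z = rng.binomial(1, p)           # z ~ Ber_p, independent of the rest
    return w, x, y, z

def f(w, x, y, z):
    # f keeps the kappa-sample y when z = 0 and the mu-sample w when z = 1.
    return y if z == 0 else w
```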
Phrased in this precise and rigorous way, your question asks the following.
Reformulated question. For any $B\in\mathcal E$, determine the conditional probability $\mathbb P(f\in B\mid x)$.
You have guessed a formula for this conditional probability, which we will now verify.
Claim. The random variable $(1-p)\kappa(x, B)+p\mu(B)$ on $\Gamma$ is a version of $\mathbb P(f\in B\mid x)$.
In the proof of this claim, we will use notation like $\mathbb E[\textrm{variable};\textrm{conditions}]$ as a shorthand for the expectation of (variable times the indicator of the conditions) with respect to $\mathbb P$.
Proof. Unwinding the definition of conditional probability, the claim amounts to showing that $$ \mathbb P(f\in B,x\in A)=(1-p)\mathbb E[\kappa(x, B);x\in A]+p\mu(B)\mathbb P(x\in A)\tag{2} $$ for all sets $A\in \mathcal E$.

Splitting up the left side, we see that $$ \mathbb P(f\in B,x\in A)=\mathbb P(f\in B,z=0,x\in A)+\mathbb P(f\in B,z=1,x\in A). $$ On $\{z=0\}$ we have $f=y$, and on $\{z=1\}$ we have $f=w$. Thus $$ \mathbb P(f\in B,x\in A)=\mathbb P(y\in B,z=0,x\in A)+\mathbb P(w\in B,z=1,x\in A). $$

Using independence (coming from the product structure of $\mathbb P$) then yields $$ \mathbb P(f\in B,x\in A)=(1-p)\mathbb P(y\in B,x\in A)+p\mu(B)\mathbb P(x\in A). $$

Recalling that the law of $(x,y)$ is $\nu\rtimes \kappa$ and directly applying the definition of the semidirect product yields $\mathbb P(y\in B,x\in A)=\mathbb E[\kappa(x,B);x\in A]$. Substituting this into the previous display yields $(2)$, establishing the claim.
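As a sanity check, the claimed kernel $Q(x,B)=(1-p)\kappa(x,B)+p\mu(B)$ can be verified by Monte Carlo on a finite space. The specific measures, kernel, and $p$ below are arbitrary hypothetical choices:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical ingredients on E = {0, 1}.
mu = np.array([0.5, 0.5])
nu = np.array([0.3, 0.7])
kappa = np.array([[0.9, 0.1],   # kappa[x, y] = kappa(x, {y})
                  [0.2, 0.8]])
p = 0.25
n = 200_000

# Draw n i.i.d. points (w, x, y, z) of Gamma = E^3 x {0, 1} and apply f.
w = rng.choice(2, size=n, p=mu)           # w ~ mu
x = rng.choice(2, size=n, p=nu)           # x ~ nu
u = rng.random(n)
y = (u >= kappa[x, 0]).astype(int)        # given x, y ~ kappa(x, .)
z = rng.binomial(1, p, size=n)            # z ~ Ber_p
fv = np.where(z == 0, y, w)               # f(w, x, y, z)

# Empirical P(f = 1 | x = x0) against the claimed Q(x0, {1}).
for x0 in (0, 1):
    emp = fv[x == x0].mean()
    claimed = (1 - p) * kappa[x0, 1] + p * mu[1]
    print(f"x = {x0}: empirical {emp:.4f}, claimed {claimed:.4f}")
```

With these choices the claimed values are $Q(0,\{1\})=0.75\cdot 0.1+0.25\cdot 0.5=0.2$ and $Q(1,\{1\})=0.75\cdot 0.8+0.25\cdot 0.5=0.725$, and the empirical frequencies agree to within Monte Carlo error.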