Motivate that if $X \mid Y = y$ follows some distribution, then $P[X = k] = \int \kappa(y, \{k\}) f_Y(y) \, dy$


In introductory courses on probability one is often introduced to problems of the form:

Suppose $X \mid Y = y \sim \text{Geom}(y)$, where $Z \sim \text{Geom}(p)$ denotes the geometric distribution with $P(Z = k) = (1-p)^{k-1}p$. Calculate the distribution of $X$.

If $Y$ is also discrete-valued, one could take $X \mid Y = y \sim \text{Geom}(y)$ to mean that

$$P[X = k \mid Y = y] = \frac{P[X = k, Y = y]}{P[Y = y]} = (1-y)^{k-1} y$$

and to get $P[X = k]$ one would sum this expression over all $y$, multiplying each term by $P[Y = y]$. This is in effect an application of the fact that a probability measure is $\sigma$-additive.
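The discrete computation can be checked numerically. Below is a minimal sketch, assuming a concrete (made-up) example where $Y$ takes the two values $0.2$ and $0.5$ with equal probability; the sum $\sum_y P[X = k \mid Y = y]\, P[Y = y]$ is compared against a Monte Carlo simulation of the two-stage experiment:

```python
import random

# Assumed example: Y takes values in {0.2, 0.5} with equal probability,
# and X | Y = y ~ Geom(y) with pmf P(X = k | Y = y) = (1-y)^(k-1) y.
p_Y = {0.2: 0.5, 0.5: 0.5}

def geom_pmf(k, y):
    return (1 - y) ** (k - 1) * y

def p_X(k):
    # Law of total probability: sum P(X = k | Y = y) * P(Y = y) over all y.
    return sum(geom_pmf(k, y) * py for y, py in p_Y.items())

# Monte Carlo check: sample Y first, then X | Y, and compare frequencies.
random.seed(0)
n = 200_000
hits = 0
for _ in range(n):
    y = random.choices(list(p_Y), weights=list(p_Y.values()))[0]
    # Sample X ~ Geom(y) by counting Bernoulli(y) trials until the first success.
    x = 1
    while random.random() >= y:
        x += 1
    if x == 3:
        hits += 1

print(p_X(3))      # exact: 0.8^2*0.2*0.5 + 0.5^2*0.5*0.5 = 0.1265
print(hits / n)    # should be close to 0.1265
```

The two printed numbers agreeing (up to sampling error) is exactly the content of the summation formula.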


One is told that, in a similar vein, when $Y$ has a continuous distribution one should instead integrate over $y$ with respect to the density of $Y$.

I would like to motivate why this is so.

In the case when $Y$ is continuous, I suppose the framework is that we assume there exists a probability kernel $\kappa(y, \{k\}) = (1-y)^{k-1} y$,

and the claim would then be that

$$P[X = k] = \int \kappa(y, \{k\}) f_Y(y) \, dy$$

if a density exists, and otherwise

$$P[X = k] = \int \kappa(y, \{k\}) \, P_Y(dy).$$
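The continuous claim can also be sanity-checked numerically. Below is a sketch for an assumed example with $Y \sim \mathrm{Uniform}(0,1)$ (so $f_Y \equiv 1$ on $[0,1]$) and $X \mid Y = y \sim \text{Geom}(y)$; in that case the integral $\int_0^1 (1-y)^{k-1} y \, dy$ is a Beta integral equal to $1/(k(k+1))$:

```python
# Assumed example: Y ~ Uniform(0, 1), X | Y = y ~ Geom(y), so that
# P(X = k) = ∫₀¹ (1-y)^(k-1) y dy = B(2, k) = 1 / (k (k + 1)).

def p_X(k, steps=100_000):
    # Midpoint-rule approximation of ∫₀¹ κ(y, {k}) f_Y(y) dy with f_Y ≡ 1.
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        y = (i + 0.5) * h          # midpoint of the i-th subinterval
        total += (1 - y) ** (k - 1) * y
    return total * h

k = 4
print(p_X(k))            # numeric integral, ≈ 0.05
print(1 / (k * (k + 1)))  # closed form, = 0.05
```

That the quadrature value matches the closed form is precisely the claimed identity $P[X = k] = \int \kappa(y, \{k\}) f_Y(y)\, dy$ for this example.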

What motivates this?

Thanks in advance!

Best answer:

I'm not sure what you're looking for, but the main statement you're asking about is essentially the tower property of conditional expectation: for any random variables $Z$ and $W$,
$$ E[Z] = E\bigl[E[Z \mid W]\bigr]. $$

This is a more general version of what you describe above: to get $P[X = k]$ from $P[X = k \mid Y = y]$, one sums this expression over all $y$, weighting each term by $P[Y = y]$. In other words, to get $P(X = k)$ you average $P(X = k \mid Y = y)$ over all possible values of $y$, with the average weighted according to the distribution of $Y$. In the same way, to get $E[Z]$ from $E[Z \mid W]$, you average over all values of $W$. You can also think of $E[Z \mid W]$ as conditioning on $W$, and of taking the outer expectation in $E[E[Z \mid W]]$ as un-conditioning, leaving $E[Z]$.

In the setting you're asking about, we can take $Z = 1_{\{X=k\}}$ (since then $E[Z] = P(X = k)$) and $W = Y$. The above property then reads
$$ P(X = k) = E[1_{\{X=k\}}] = E\bigl[E[1_{\{X=k\}} \mid Y]\bigr] = E[P(X = k \mid Y)]. $$
Note that $P(X = k \mid Y)$ is a function of both $k$ and $Y$; you can write it as $\kappa(Y, \{k\}) = P(X = k \mid Y)$, similarly to what you did above. Then
$$ P(X = k) = E[\kappa(Y, \{k\})] = \int \kappa(y, \{k\}) \, P_Y(dy) $$
by the usual way of computing the expectation of a function of $Y$ (this is called the law of the unconscious statistician in undergraduate probability, or the change-of-variables formula in graduate probability).
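The last step, $P(X = k) = E[\kappa(Y, \{k\})]$, can itself be illustrated numerically: averaging the kernel over draws of $Y$ estimates the expectation. Below is a sketch using the assumed example $Y \sim \mathrm{Uniform}(0,1)$, for which $P(X = k) = 1/(k(k+1))$ in closed form:

```python
import random

# Estimate P(X = k) = E[κ(Y, {k})] by averaging the kernel
# κ(y, {k}) = (1-y)^(k-1) y over samples of Y (law of the unconscious
# statistician). Assumed example distribution: Y ~ Uniform(0, 1),
# for which the exact answer is 1 / (k (k + 1)).
random.seed(1)

def kappa(y, k):
    return (1 - y) ** (k - 1) * y

k = 4
n = 500_000
estimate = sum(kappa(random.random(), k) for _ in range(n)) / n

print(estimate)           # Monte Carlo average of κ(Y, {k}), ≈ 0.05
print(1 / (k * (k + 1)))  # exact value, = 0.05
```

The simulation never samples $X$ at all: averaging $\kappa(Y, \{k\})$ over the distribution of $Y$ already recovers $P(X = k)$, which is the content of $E[P(X = k \mid Y)] = P(X = k)$.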

Does this help?