How to make sense of $\mathbb{P} (E) = \int _{-\infty} ^{+\infty} \mathbb{P} (E | Y=y) f_Y (y) \ dy$?


Let $Y$ be a random variable and $E$ an event. In case $Y$ is discrete, I know that $$\mathbb{P} (E) = \sum _{y} \mathbb{P} (E | Y=y) \ \mathbb{P} (Y=y)$$ is just a variant on the Law of Total Probability.

However, for continuous $Y$ I often see the formula, $$\mathbb{P} (E) = \int _{-\infty} ^{+\infty} \mathbb{P} (E | Y=y) f_Y (y) \ dy,$$ where $f_Y$ is the probability density function of $Y$.

How do I make sense of the conditional probability $\mathbb{P} (E|Y=y)$, since the naive ratio $\mathbb{P} (E \cap \{ Y=y\})$ divided by $\mathbb{P} (Y=y)$ yields $0/0$? And how can I prove the above formula? Is there a simple explanation, or do I need measure-theoretic probability theory in order to comprehend this?


There are 2 answers below.

BEST ANSWER

There are different ways to define this "conditional probability" with respect to the zero-probability event $\{Y=y\}$, and they agree in an appropriate sense.

1) If you know the notion of conditional expectation, recall the following statement:

Let $(\Omega, \mathcal F)$ be a measurable space and $Y:\Omega \to \mathbb R$, $g:\Omega\to \mathbb R$ measurable functions such that $g$ is in fact $\sigma (Y)$-measurable. Then there exists a measurable function $h:\mathbb R \to \mathbb R$ such that $g = h(Y)$ (the Doob–Dynkin factorization lemma).

In these terms, take $g = \mathbb P (E \vert Y) := \mathbb E (\mathbf 1_E \vert Y)$; then there exists $h$ such that $\mathbb P (E\vert Y) = h(Y)$, and one defines $\mathbb P (E \vert Y= y) := h(y)$.
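In particular, this definition already yields the formula from the question (a quick sketch, assuming $Y$ has density $f_Y$): by the tower property of conditional expectation,

$$\mathbb P (E) = \mathbb E \big[ \mathbb P (E \vert Y) \big] = \mathbb E \big[ h(Y) \big] = \int _{-\infty} ^{+\infty} h(y) \, f_Y(y) \ dy = \int _{-\infty} ^{+\infty} \mathbb P (E \vert Y=y) \, f_Y(y) \ dy.$$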

2) A second possibility is the approach of Lucas' answer.

3) Another compatible definition is to define exactly what you need:

If $X$ is another (non-singular) random variable, let $f(x,y)$ be the joint density of $(X,Y)$ with respect to $\mu \otimes \lambda$, where $\mu \in \{\lambda , \chi\}$, $\lambda$ denotes the Lebesgue measure and $\chi$ the counting measure. Then define

$$\mathbb E (X \vert Y = y) := \int_{\mathbb{R}} x \, \frac{f(x,y)}{f_Y(y)} \ \text d \mu (x).$$
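For instance, here is a sketch of why the formula from the question is immediate with 3): take $X = \mathbf 1_E$, so that $\mu = \chi$ is the counting measure on $\{0,1\}$ and $f(1,\cdot)$ is a density with $\mathbb P (E \cap \{Y \in B\}) = \int_B f(1,y) \ dy$ (such a density exists since $\mathbb P (E \cap \{Y \in B\}) \le \int_B f_Y(y) \ dy$). Then

$$\mathbb P (E \vert Y=y) = \mathbb E (\mathbf 1_E \vert Y = y) = \sum_{x \in \{0,1\}} x \, \frac{f(x,y)}{f_Y(y)} = \frac{f(1,y)}{f_Y(y)},$$

and integrating against $f_Y(y) \ dy$ gives

$$\int _{-\infty} ^{+\infty} \mathbb P (E \vert Y=y) \, f_Y(y) \ dy = \int _{-\infty} ^{+\infty} f(1,y) \ dy = \mathbb P (E).$$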

All in all, in my opinion Lucas' way is the most intuitive one, but with 1) and 3) the "law of total probability" holds immediately.

ANSWER

Try thinking of it like this: $$P(E|y < Y < y+\Delta y) = \frac{P(E \cap \{y< Y < y+\Delta y\})}{P(y<Y<y+\Delta y)}$$ and send $\Delta y \rightarrow 0$. The numerator and denominator both go to zero, but they do so at the same rate (both are of order $\Delta y$), so the ratio has a well-defined limit, and this "conditional cross section" is in general not zero.
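If it helps, here is a small numerical sanity check of the formula in the question (a hypothetical example, not taken from either answer): let $Y \sim \mathcal N(0,1)$ and $E = \{Z > Y\}$ with $Z \sim \mathcal N(0,1)$ independent of $Y$, so that $\mathbb P (E | Y=y) = 1 - \Phi(y)$ and, by symmetry, $\mathbb P (E) = 1/2$.

    # Numerical check of P(E) = integral of P(E | Y=y) f_Y(y) dy for the
    # hypothetical example Y ~ N(0,1), Z ~ N(0,1) independent, E = {Z > Y},
    # where P(E | Y=y) = 1 - Phi(y) and P(E) = 1/2 by symmetry.
    import numpy as np
    from scipy.integrate import quad
    from scipy.stats import norm

    rng = np.random.default_rng(0)
    n = 10**6
    y = rng.standard_normal(n)
    z = rng.standard_normal(n)

    # Direct Monte Carlo estimate of P(E).
    p_direct = np.mean(z > y)

    # Right-hand side: integrate P(E | Y=y) * f_Y(y) over the real line.
    p_integral, _ = quad(lambda t: (1 - norm.cdf(t)) * norm.pdf(t), -np.inf, np.inf)

    print(p_direct, p_integral)  # both come out close to 0.5

Both numbers come out near $0.5 = \mathbb P (E)$, as the formula predicts.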