Confusion about continuous conditional densities


I have a long-standing confusion about the definition of conditional probability for continuous random variables. Assume we have a joint density function $f_{X,Y}$ such that $P(a < X < b, c < Y < d) =\int_{c}^{d}\int_{a}^{b}f_{X,Y}(x,y)\,dx\,dy$. Then we can define the conditional probability density function $f_{X|Y=y}$ as $f_{X|Y=y}(x) =\displaystyle\frac{f_{X,Y}(x,y)}{\int_{-\infty}^{\infty}f_{X,Y}(x,y)\,dx}=\frac{f_{X,Y}(x,y)}{f_{Y}(y)}$.

Now, it is intuitive to think that we can calculate the probability of $X$ being in an interval $[a,b]$ given $Y=y$ by $P(a < X < b | Y = y) =\int_{a}^{b}f_{X|Y=y}(x)dx$.

This same conditional probability can also be written as $P(a < X < b | Y = y) = \displaystyle\frac{P(a < X < b , Y = y)}{P(Y=y)}$. But this is equal to $\displaystyle\frac{\int_{y}^{y}\int_{a}^{b}f_{X,Y}(x,t)\,dx\,dt}{\int_{y}^{y}\int_{-\infty}^{\infty}f_{X,Y}(x,t)\,dx\,dt}$. Since $P(Y=y)$ is the probability mass of a straight line in the $(X,Y)$ plane, it is equal to $0$, and so is the numerator. This makes the conditional probability undefined.

So, I get confused here. While $\int_{a}^{b}f_{X|Y=y}(x)dx$ looks like computing the conditional probability $P(a < X < b | Y = y)$ correctly, we obtain a division by zero if we try to compute it by using the joint density $f_{X,Y}$. What is the part I am missing here? Doesn't the expression $\int_{a}^{b}f_{X|Y=y}(x)dx$ compute a probability value? Are these two calculating different things?

Thanks in advance


There are 2 best solutions below

2
On BEST ANSWER

Your notation is a bit confusing. It is better to separate random variables (usually upper case) from integration variables (usually in lower case).

If the random vector $(X,Y)$ has density $f_{X,Y}$, then the conditional density given $Y = y$ is obtained by dividing the joint density by the marginal density of $Y$, which is itself found by integrating the joint density over the variable associated with $X$: $$ f_{X|Y=y}(x) = \frac{f_{X,Y}(x,y)}{\displaystyle \int_{-\infty}^\infty f_{X,Y}(x,y)\, dx}$$

Of course, in general, if a random variable $X$ has a density (with respect to the Lebesgue measure), events of the type $\lbrace X = x \rbrace$ have zero probability.

Thus, we must construct the formula for the conditional density as a limit. The following (non-rigorous) argument captures the main idea. Consider a small $\varepsilon > 0$; then: $$P(X \in A \,|\, Y \in [y_0-\varepsilon, y_0+\varepsilon]) = \frac{P(X \in A \, , \, Y \in [y_0-\varepsilon, y_0+\varepsilon])}{P(Y \in [y_0-\varepsilon, y_0+\varepsilon])} =\frac{\displaystyle \int_A \int_{y_0 - \varepsilon}^{y_0 + \varepsilon} f_{X,Y}(x,y) \, dy\,dx }{\displaystyle\int_{-\infty}^\infty \int_{y_0 - \varepsilon}^{y_0 + \varepsilon}f_{X,Y}(x,y) \, dy\,dx } =\frac{\displaystyle\frac{1}{2\varepsilon}\int_A \int_{y_0 - \varepsilon}^{y_0 + \varepsilon} f_{X,Y}(x,y) \, dy\,dx }{\displaystyle\frac{1}{2\varepsilon}\int_{-\infty}^\infty \int_{y_0 - \varepsilon}^{y_0 + \varepsilon}f_{X,Y}(x,y) \, dy\,dx }$$

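Taking $\varepsilon \to 0$ (and assuming $f_{X,Y}$ is continuous in $y$ near $y_0$ and that the denominator below is positive), the averages $\frac{1}{2\varepsilon}\int_{y_0-\varepsilon}^{y_0+\varepsilon} f_{X,Y}(x,y)\,dy$ tend to $f_{X,Y}(x,y_0)$, so $$P(X \in A | Y = y_0) = \frac{\displaystyle\int_A f_{X,Y}(x,y_0) \, dx}{\displaystyle\int_{-\infty}^{\infty}f_{X,Y}(x,y_0) \, dx}$$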

and from here you can derive the expression for the conditional density $f_{X|Y=y_0}(x)$ by taking $A = (-\infty, x]$ and differentiating with respect to $x$ (the non-rigorous part is passing the limit inside the integral).
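The limiting argument can be checked numerically. The sketch below is only an illustration: the joint density $f_{X,Y}(x,y) = \frac{6}{5}(x + y^2)$ on the unit square, the helper names, and the midpoint-rule integrator are all assumptions chosen for the example, not anything taken from the question. It compares the window probability $P(X \in A \mid Y \in [y_0-\varepsilon, y_0+\varepsilon])$ with the ratio of single integrals at $y = y_0$ for shrinking $\varepsilon$.

```python
def joint_density(x, y):
    """Hypothetical joint density f(x, y) = (6/5)(x + y^2) on the unit square."""
    if 0.0 <= x <= 1.0 and 0.0 <= y <= 1.0:
        return 1.2 * (x + y * y)
    return 0.0


def integrate(func, lo, hi, steps=200):
    """Midpoint-rule approximation of the integral of func over [lo, hi]."""
    width = (hi - lo) / steps
    return width * sum(func(lo + (i + 0.5) * width) for i in range(steps))


def rectangle_probability(a, b, c, d):
    """P(a < X < b, c < Y < d) as a double integral of the joint density."""
    return integrate(lambda y: integrate(lambda x: joint_density(x, y), a, b), c, d)


def window_conditional(a, b, y0, eps):
    """P(a < X < b | y0 - eps <= Y <= y0 + eps), an ordinary ratio of probabilities."""
    numerator = rectangle_probability(a, b, y0 - eps, y0 + eps)
    # The density vanishes outside [0, 1], so integrating x over [0, 1] is enough.
    denominator = rectangle_probability(0.0, 1.0, y0 - eps, y0 + eps)
    return numerator / denominator


def density_conditional(a, b, y0):
    """Ratio of single integrals at y = y0, i.e. the integral of f_{X|Y=y0} over (a, b)."""
    numerator = integrate(lambda x: joint_density(x, y0), a, b)
    denominator = integrate(lambda x: joint_density(x, y0), 0.0, 1.0)
    return numerator / denominator


a, b, y0 = 0.0, 0.5, 0.3
limit_value = density_conditional(a, b, y0)
for eps in (0.2, 0.1, 0.01, 0.001):
    approx = window_conditional(a, b, y0, eps)
    print(f"eps={eps:<6} window={approx:.6f} gap={abs(approx - limit_value):.2e}")
print(f"limit from the conditional density: {limit_value:.6f}")
```

Every window probability here is a well-defined ratio with a positive denominator; only the limit corresponds to conditioning on the zero-probability event $\lbrace Y = y_0 \rbrace$.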

2
On

Your confusion is quite understandable. My answer is not complete, but it might give some insight.

In particular, the condition $P\left(Y=y\right)=0$ rules out defining a conditional probability such as $P\left(a<X\le b\mid Y=y\right)$ as the ratio $\dfrac{P\left(a<X\le b\wedge Y=y\right)}{P\left(Y=y\right)}$.

In fact, if $A$ and $B$ are two events and $P\left(B\right)=0$, then the equality $P\left(A\mid B\right)P\left(B\right)=P\left(A\cap B\right)$ holds for any value of $P\left(A\mid B\right)$, because both sides are $0$.

The mathematician could give up here and leave these 'conditionals' out of sight. His intuition, however, makes it impossible to do so. So he goes searching for another route and finds one: $$P\left(a<X\leq b\mid Y=y\right)=\int_{a}^{b}f_{X\mid Y=y}\left(x\right)dx$$ A convenient value for $P\left(a<X\leq b\mid Y=y\right)$ has been found such that $$P\left(a<X\leq b\mid Y=y\right)P\left(Y=y\right)=P\left(a<X\leq b\wedge Y=y\right)$$ holds.

It is convenient in the mathematical sense, and also in the sense that it matches intuition.
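One way to see what makes this value convenient: the pointwise equality holds for any value (both sides are $0$), but the averaged version $\int_c^d P\left(a<X\leq b\mid Y=y\right)f_Y(y)\,dy = P\left(a<X\leq b,\ c<Y\leq d\right)$ does depend on the value chosen. The sketch below checks that averaged identity numerically. The joint density $\frac{6}{5}(x+y^2)$ on the unit square, the function names, and the midpoint-rule integrator are assumptions made for the illustration.

```python
def joint_density(x, y):
    """Hypothetical joint density f(x, y) = (6/5)(x + y^2) on the unit square."""
    if 0.0 <= x <= 1.0 and 0.0 <= y <= 1.0:
        return 1.2 * (x + y * y)
    return 0.0


def integrate(func, lo, hi, steps=200):
    """Midpoint-rule approximation of the integral of func over [lo, hi]."""
    width = (hi - lo) / steps
    return width * sum(func(lo + (i + 0.5) * width) for i in range(steps))


def marginal_y(y):
    """f_Y(y): the joint density integrated over x (its support is [0, 1])."""
    return integrate(lambda x: joint_density(x, y), 0.0, 1.0)


def conditional_probability(a, b, y):
    """Integral of f_{X|Y=y} over (a, b], the value chosen for P(a < X <= b | Y = y)."""
    return integrate(lambda x: joint_density(x, y), a, b) / marginal_y(y)


def averaged_conditional(a, b, c, d):
    """Conditional probabilities weighted by f_Y and integrated over (c, d]."""
    return integrate(lambda y: conditional_probability(a, b, y) * marginal_y(y), c, d)


def rectangle_probability(a, b, c, d):
    """P(a < X <= b, c < Y <= d) computed directly from the joint density."""
    return integrate(lambda y: integrate(lambda x: joint_density(x, y), a, b), c, d)


lhs = averaged_conditional(0.0, 0.5, 0.2, 0.7)
rhs = rectangle_probability(0.0, 0.5, 0.2, 0.7)
print(f"averaged conditional: {lhs:.6f}")
print(f"joint probability:    {rhs:.6f}")
```

The two numbers agree up to floating-point error, which is expected by construction: multiplying the conditional density by $f_Y(y)$ gives back the joint density. That is the sense in which this value is consistent with the joint distribution even though $P(Y=y)=0$.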