Form of Bayes' equation for continuous random variables


Bayes' theorem says that, if $X$ and $Y$ are discrete random variables, then

$${\displaystyle P(A\mid B)={\frac {P(B\mid A)P(A)}{P(B\mid A)P(A)+P(B\mid \neg A)P(\neg A)}}.}$$

I was then told that, if $X$ and $Y$ are continuous random variables, then Bayes' theorem says that

$$f_{X \vert Y}(x \vert y) = \dfrac{f_{Y \vert X}(y \vert x) f_X(x)}{\int f_{Y \vert X} (y \vert x') f_X(x') \ dx'}.$$

Seeing this latter form made me wonder: Why is the denominator $\int f_{Y \vert X} (y \vert x') f_X(x') \ dx'$ instead of $\int f_{Y \vert X} (y \vert x) f_X(x) \ dx + \int f_{Y \vert X} (y \vert x') f_X(x') \ dx'$? It seems like, given the form of Bayes' equation for discrete random variables, it would be reasonable to have the form

$$f_{X \vert Y}(x \vert y) = \dfrac{f_{Y \vert X}(y \vert x) f_X(x)}{\int f_{Y \vert X} (y \vert x) f_X(x) \ dx + \int f_{Y \vert X} (y \vert x') f_X(x') \ dx'}?$$

Could people please help me understand why this is the case?

Thank you.


BEST ANSWER

Here, the prime ($'$) signifies that the term is distinct from $x$; it does not need to be distinct, but using a separate symbol is intended to avoid confusion. It does not indicate any form of complementation (or differentiation, for that matter; I consider using the prime in this setting inadvisable).

Remember that the variable of integration can be replaced by any other symbol that does not already occur free in the integrand:

$$\int g(x,y)~\mathrm d x=\int g(s,y)~\mathrm d s=\int g(x',y)~\mathrm d x'=\int g(\xi,y)~\mathrm d\xi$$
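This point can be checked numerically. The sketch below (my own illustration, not part of the answer) approximates the same integral with a midpoint Riemann sum three times, changing only the name of the bound variable:

```python
# The value of an integral does not depend on the name of the integration
# variable. We approximate an integral over [0, 1] with a midpoint Riemann
# sum; renaming the loop variable changes nothing.

def integral(g, a=0.0, b=1.0, n=10_000):
    """Midpoint Riemann sum of g over [a, b]."""
    h = (b - a) / n
    return sum(g(a + (i + 0.5) * h) for i in range(n)) * h

g = lambda u: u ** 2  # any integrand

# "x", "s", "xi" below are just different labels for the same bound variable
val_x  = integral(lambda x: g(x))
val_s  = integral(lambda s: g(s))
val_xi = integral(lambda xi: g(xi))

print(val_x == val_s == val_xi)  # True: only the label differs
```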


Additionally, your presentation of Bayes' Rule was for events $A,B$, rather than discrete random variables.

For discrete random variables $X,Y$, the summation runs over all supported values of $X$ (those where the marginal probability mass function is positive).

$$\mathsf P(X=x\mid Y=y)=\dfrac{\mathsf P(Y=y\mid X=x)~\mathsf P(X=x)}{\sum_{s}\mathsf P(Y=y\mid X=s)~\mathsf P(X=s)}$$
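As a small numeric sketch (with a made-up prior and likelihood, not from the question), the denominator is a single sum over the whole support of $X$, not a sum of two separate pieces:

```python
# Hypothetical 3-point prior P(X = s) and likelihood P(Y = y | X = s)
# for one fixed observation y.
prior = {1: 0.5, 2: 0.3, 3: 0.2}
likelihood = {1: 0.9, 2: 0.4, 3: 0.1}

def posterior(x):
    """P(X = x | Y = y) via Bayes' rule for discrete X."""
    num = likelihood[x] * prior[x]
    # one sum over ALL supported values s, not just x and one other value
    den = sum(likelihood[s] * prior[s] for s in prior)
    return num / den

post = {x: posterior(x) for x in prior}
print(post)
print(abs(sum(post.values()) - 1.0) < 1e-12)  # posterior sums to 1
```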

For continuous random variables, the integration is analogous to summation.

$$f_{X\mid Y}(x\mid y)=\dfrac{f_{Y\mid X}(y\mid x)~f_X(x)}{\int f_{Y\mid X}(y\mid s)~f_X(s)~\mathrm d s}$$
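The same structure can be seen numerically. Below is a minimal sketch (my own example, assuming a hypothetical conjugate model with prior $X \sim \mathrm{Normal}(0,1)$ and likelihood $Y \mid X = x \sim \mathrm{Normal}(x,1)$); the denominator is one integral over a dummy variable playing the role of $x'$:

```python
import math

def normal_pdf(t, mu, sigma):
    """Density of Normal(mu, sigma) at t."""
    return math.exp(-((t - mu) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

def f_X(x):                 # prior density f_X
    return normal_pdf(x, 0.0, 1.0)

def f_Y_given_X(y, x):      # likelihood f_{Y|X}
    return normal_pdf(y, x, 1.0)

def posterior_density(x, y, a=-10.0, b=10.0, n=40_000):
    """f_{X|Y}(x | y), with the denominator computed by midpoint-rule integration."""
    h = (b - a) / n
    # ONE integral over the dummy variable x' (here the midpoint a + (i + 0.5)h)
    den = sum(f_Y_given_X(y, a + (i + 0.5) * h) * f_X(a + (i + 0.5) * h)
              for i in range(n)) * h
    return f_Y_given_X(y, x) * f_X(x) / den

# For this model the exact posterior is Normal(y / 2, 1 / sqrt(2)),
# so the numerical and closed-form densities should agree closely.
y = 1.0
approx = posterior_density(0.5, y)
exact = normal_pdf(0.5, y / 2.0, 1.0 / math.sqrt(2.0))
print(abs(approx - exact) < 1e-5)
```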

Another answer:

Notice that in the discrete case the denominator is the total probability that event $B$ occurs (the first term is "$B$ occurs and $A$ occurs", while the second is "$B$ occurs and $A$ does not occur"). Analogously, in the continuous case the denominator is the overall density around the event $Y=y$. We can no longer speak of "the probability that $Y=y$ occurs", because that probability is zero.

A way to express Bayes' rule in the discrete case that is consistent with the formula for the continuous case would be $$P(A|B)=\frac{P(B|A)P(A)}{\sum_{\omega\in\Omega}P(B|\omega)P(\omega)}$$ where $\Omega$ is the (countable) sample space.

As Graham Kemp points out, do not get confused by the $'$ notation of the dummy variable in the integral.