Can someone explain to me how we derive the alternative form of Bayes theorem?


I understand how we get this formula

$$\Pr(H\mid E) = \frac{\Pr(H)\Pr(E\mid H)}{\Pr(E)}$$

from the fact that $\Pr(H\cap E)$ is equal to both $\Pr(H)\Pr(E\mid H)$ and $\Pr(E)\Pr(H\mid E),$ and solving for $\Pr(H\mid E).$

But how do we go from the denominator in the formula above to this new denominator below

$$\Pr(E) =\Pr(H)\Pr(E\mid H)+\Pr(\bar H)\Pr(E\mid \bar H)$$

?

Please explain it to me like I'm ten years old. I'm new to Bayes and learned the above after going over Venn diagrams several times, so break it down for me.

Thanks so much.

There are 2 best solutions below.
Look at the probability tree diagram:

[probability tree diagram: the first branching splits into $H$ and $\overline H$, and each of those branches splits into $E$ and $\overline E$]

$$\begin{align}P(H\mid E)=\frac{P(H\cap E)}{P(E)} &=\frac{P(E\cap H)}{P(E\cap H)+P(E\cap \overline H)}\\ &=\frac{\color{red}{P(H)\cdot P(E\mid H)}}{P(H\cap E)+P(\overline H\cap E)}\\ &=\frac{\color{red}{P(H)\cdot P(E\mid H)}}{\color{red}{P(H)\cdot P(E\mid H)}+\color{blue}{P(\overline H)\cdot P(E\mid \overline H)}}. \end{align}$$
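To see the tree in action, here is a tiny numeric sketch (the probabilities are made up purely for illustration): each leaf's probability is the product along its branch, and $P(E)$ is the sum over the leaves where $E$ occurs.

```python
# Illustrative branch probabilities (not from the question):
p_h = 0.3               # P(H), first branching
p_e_given_h = 0.8       # P(E|H), second branching under H
p_e_given_not_h = 0.2   # P(E|~H), second branching under ~H

# Leaf probabilities = product of probabilities along the branch:
leaf_h_e = p_h * p_e_given_h                # P(H ∩ E)
leaf_not_h_e = (1 - p_h) * p_e_given_not_h  # P(~H ∩ E)

# P(E) = sum over the leaves where E occurs (the denominator above):
p_e = leaf_h_e + leaf_not_h_e

# Bayes: posterior is the red leaf divided by the sum of both E-leaves.
p_h_given_e = leaf_h_e / p_e
```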


It is called the Law of Total Probability.

Since the union of any event $H$ and its complement $\overline H$ is the whole sample space (by definition), and the intersection of any event $E$ with the sample space equals $E$ itself, clearly:

$$\begin{split}E&=E\cap (H\cup\overline H)\\ & = (E\cap H)\cup(E\cap \overline H)\end{split}$$

Now, the two components of that union are disjoint, and the probability of a union of disjoint events is the sum of their probabilities, so:$$\mathsf P(E)=\mathsf P(E\cap H)+\mathsf P(E\cap\overline H)$$ Then it is just a matter of applying the definition of conditional probability:$$\mathsf P(E)=\mathsf P(E\mid H)~\mathsf P(H)+\mathsf P(E\mid\overline H)~\mathsf P(\overline H)$$

And thus we have that: $$\mathsf P(H\mid E)=\dfrac{\mathsf P(E\mid H)~\mathsf P(H)}{\mathsf P(E\mid H)~\mathsf P(H)+\mathsf P(E\mid\overline H)~\mathsf P(\overline H)}$$
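As a quick sanity check of that formula, here is a worked example with made-up numbers (think of $H$ as "has a condition" and $E$ as "tests positive"; the values are illustrative, not from the question):

```python
# Illustrative inputs (assumed for this example):
p_h = 0.01              # prior P(H)
p_e_given_h = 0.95      # P(E|H)
p_e_given_not_h = 0.05  # P(E|~H)

# Law of Total Probability gives the denominator P(E):
p_e = p_h * p_e_given_h + (1 - p_h) * p_e_given_not_h

# Bayes' theorem in its alternative form:
p_h_given_e = p_h * p_e_given_h / p_e
```

Note how the evidence term $P(\overline H)\,P(E\mid \overline H)$ dominates the denominator here, which is why the posterior ends up much smaller than $P(E\mid H)$.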


In general, when you have a sequence of events ${(B_k)}_{k=1}^n$ that partitions the sample space (i.e., the events are pairwise disjoint and exhaustive), then similarly:

$$\begin{split}\mathsf P(E) &=\mathsf P\Big(E\cap\bigcup_{k=1}^n B_k\Big)\\ &=\sum_{k=1}^n \mathsf P(E\cap B_k)\\ &=\sum_{k=1}^n \mathsf P(E\mid B_k)\,\mathsf P(B_k)\end{split}$$
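The general partition version can be sketched numerically too. Here is a minimal example with three hypotheses (the priors and likelihoods are assumed for illustration); it also checks that the resulting posteriors sum to 1, as they must for a partition.

```python
# A partition of the sample space into three disjoint, exhaustive events B_1..B_3.
priors = [0.5, 0.3, 0.2]       # P(B_k); must sum to 1
likelihoods = [0.9, 0.4, 0.1]  # P(E | B_k), illustrative values

# Law of Total Probability: P(E) = sum of P(E|B_k) * P(B_k)
p_e = sum(p * l for p, l in zip(priors, likelihoods))

# Bayes' theorem for each hypothesis in the partition:
posteriors = [p * l / p_e for p, l in zip(priors, likelihoods)]

# Posteriors over a partition always sum to 1.
assert abs(sum(posteriors) - 1) < 1e-12
```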