Where's the "update" in Bayes?


I understand Bayes theorem that I have a prior belief about a hypothesis: $$\Bbb P_0=P(H)$$

But I really don't understand what the literature means when it says prior probabilities are "updated".

Does this mean that the term $\Bbb P_0=P(H)$ is replaced by $P(H|E)$ every time?

So the next time I observe new evidence $E_1,\dots,E_n$: $$\Bbb P_1=P(H|E_1)= \frac {P(E_1|H)}{P(E_1)}\cdot\Bbb P_0$$ $$\Bbb P_2= P(H|E_2)=\frac {P(E_2|H)}{P(E_2)}\cdot\Bbb P_1$$ $$\Bbb P_3= P(H|E_3)=\frac {P(E_3|H)}{P(E_3)}\cdot\Bbb P_2$$ $$\vdots$$ $$\Bbb P_n= P(H|E_n)=\frac {P(E_n|H)}{P(E_n)}\cdot\Bbb P_{n-1}$$

I'm confused as to what happens to $P(H)$.

UPDATE

If one substitutes the above equalities into the last one, it looks like this: $$\Bbb P_n= P(H|E_n)=\frac {P(E_n|H)P(E_{n-1}|H)\cdots P(E_1|H)P(H)}{P(E_n)P(E_{n-1})\cdots P(E_1)}$$ $$= \frac {P(H|E_n)P(E_n)\cdot P(H|E_{n-1})P(E_{n-1})\cdots P(H|E_1)P(E_1)\cdot P(H)}{P(E_n)P(E_{n-1})\cdots P(E_1)\cdot P^n(H)}$$ $$\Rightarrow P(H|E_{n-1})\cdots P(H|E_1)=P^{n-1}(H)$$

This doesn't look right.


BEST ANSWER

That's almost correct, but there are two issues with how we want to update:

  • We really want $\mathbb P_n = P(H \mid E_1 \land E_2 \land \dots \land E_n)$, not $\mathbb P_n = P(H \mid E_n)$: our final probability should take into account all the evidence, not just the last piece of evidence.
  • Evidence might not be independent: when updating on $E_n$, we should take into account that we've already seen $E_1, E_2, \dots, E_{n-1}$. (As an extreme case, if $E_1, E_2, \dots, E_n$ were all the same piece of evidence $E$, we would not want to multiply by $\frac{P(E \mid H)}{P(E)}$ $n$ times, just once.)
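The extreme case in the second bullet is easy to check numerically. The numbers below are illustrative assumptions (not from the question): a prior of $0.5$ and a likelihood ratio of $0.8/0.2$ for a single piece of evidence $E$.

```python
# Assumed numbers: P(H) = 0.5, P(E|H) = 0.8, P(E|not H) = 0.2.
p_h = 0.5
p_e_given_h = 0.8
p_e_given_not_h = 0.2

# Marginal P(E) by the law of total probability.
p_e = p_e_given_h * p_h + p_e_given_not_h * (1 - p_h)

# One correct update on E.
posterior = p_e_given_h * p_h / p_e  # 0.8

# Naively "updating" on the same E a second time multiplies in the
# likelihood ratio again, overstating the strength of the evidence:
# the correct posterior is still 0.8, since E carries no new information.
double_counted = p_e_given_h * posterior / (
    p_e_given_h * posterior + p_e_given_not_h * (1 - posterior)
)  # about 0.941
```

The second "update" pushes the probability from $0.8$ to about $0.94$ even though nothing new was observed, which is exactly the double counting the bullet warns against.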

Taking this into account, the formulas for $\mathbb P_1, \mathbb P_2, \mathbb P_3, \dots$ instead look like:

\begin{align} \mathbb P_1 &= P(H \mid E_1) = \frac{P(E_1 \mid H)}{P(E_1)} \cdot \mathbb P_0 \\ \mathbb P_2 &= P(H \mid E_1 \land E_2) = \frac{P(E_2 \mid H \land E_1)}{P(E_2\mid E_1)} \cdot \mathbb P_1 \\ \mathbb P_3 &= P(H \mid E_1 \land E_2\land E_3) = \frac{P(E_3 \mid H\land E_1 \land E_2)}{P(E_3\mid E_1\land E_2)} \cdot \mathbb P_2 \\ \dots &= \dots \end{align}

Substituting them into each other and using the rule $P(A \land B \mid C) = P(A \mid C)\, P(B \mid A \land C)$, we get the following formula for finding $\mathbb P_n$ directly: $$ \mathbb P_n = P(H \mid E_1 \land E_2 \land \dots \land E_n) = \frac{P(E_1 \land E_2 \land \dots \land E_n \mid H) P(H)}{P(E_1 \land E_2 \land \dots \land E_n)}. $$
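A quick numerical sketch of this equivalence, with assumed numbers and two pieces of evidence that are conditionally independent given each hypothesis (so the chain-rule factors simplify): the sequential posterior-becomes-prior updates land on the same answer as the single joint update in the displayed formula, where the marginal $P(E_1 \land E_2)$ is computed by total probability over $H$ and $\lnot H$.

```python
# Assumed numbers: P(H) = 0.3, and likelihoods for E1, E2 that are
# conditionally independent given H and given not-H.
p_h = 0.3
lik_h = [0.9, 0.7]      # P(E1|H), P(E2|H)
lik_not_h = [0.4, 0.2]  # P(E1|not H), P(E2|not H)

# Sequential updating: the posterior after E1 becomes the prior for E2.
p = p_h
for lh, ln in zip(lik_h, lik_not_h):
    p = lh * p / (lh * p + ln * (1 - p))

# Joint (batch) updating in one step, as in the displayed formula,
# with P(E1 and E2) expanded by the law of total probability.
num = lik_h[0] * lik_h[1] * p_h
den = num + lik_not_h[0] * lik_not_h[1] * (1 - p_h)
p_joint = num / den

assert abs(p - p_joint) < 1e-12  # sequential == batch here
```

The equality holds in this sketch because conditional independence makes $P(E_2 \mid H \land E_1) = P(E_2 \mid H)$; without that assumption, the sequential factors must condition on the earlier evidence as in the formulas above.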

ANSWER

Your prior is updated sequentially by making the prior at step $t$ equal to the posterior at step $t-1$. Initially, you begin with some prior $p(H)$ based on no data; usually this is an uninformative prior. We then observe one or more data points $E_i$ and use Bayes' rule to compute a posterior. Think of this as round $1$. If it turns out that we will observe more data in the future, we repeat the process with our old posterior acting as our current prior, which is exactly how you have defined the $\mathbb{P}_n$ recursion. This is why people describe Bayes' rule as allowing us to update our beliefs as we get more data (sequential/online updating).
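The posterior-becomes-prior loop can be sketched with a standard conjugate model. This is a minimal illustration under assumptions not in the answer's text: $H$ is the unknown bias of a coin with a uniform $\mathrm{Beta}(1,1)$ prior, and the flips below are made-up data.

```python
# Uninformative Beta(1, 1) prior on the coin's bias.
alpha, beta = 1.0, 1.0
data = [1, 0, 1, 1, 0, 1]  # assumed coin flips (1 = heads)

for flip in data:
    # Conjugate update: the Beta posterior from this round is
    # carried forward as the prior for the next round.
    alpha += flip
    beta += 1 - flip

# Same result as updating on all the data at once:
assert (alpha, beta) == (1.0 + sum(data), 1.0 + len(data) - sum(data))
```

Each pass through the loop is one "round" in the sense of this answer: the parameters $(\alpha, \beta)$ summarizing yesterday's posterior are exactly what tomorrow's update starts from.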