Delta functions/Probability


In an answer here, they say:

Now note that its perfectly reasonable to have a prior that's say 2 delta functions at p=0.23 and p=0.88. Combining this prior with a likelihood coming from an observation of an H or T results in some strange function class, which is valid as a posterior as well. As you can see from the above example, using conjugate priors has nice properties, where it might be easy to say build a sequential estimation algorithm that updates the belief about the probable value of p every time you get a new observation. This wouldn't be computationally very easy had you started off with a prior that was 2 delta functions.

Question: I do not follow why the two values $p=0.23$ and $p=0.88$ need not sum to $1$.

Best answer

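Suppose you start convinced that $p=0.23$ or $p=0.88$, and your prior probability distribution is that the first is the case with probability $q$ and the second with probability $1-q$; these have $q+(1-q)=1$. The numbers $0.23$ and $0.88$ are candidate values of the parameter $p$, not the probabilities attached to those candidates, so they need not sum to $1$; it is the weights $q$ and $1-q$ on the two delta functions that must. Perhaps you have two precision-made biased coins which look identical, and you do not know which one is in your hand.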

You then experiment and get $h$ successes (heads) and $t$ failures (tails). Your posterior probability that in fact $p=0.23$ should now be $$\dfrac{q \,0.23^h \, 0.77^t}{q \,0.23^h \, 0.77^t +(1-q) \,0.88^h \, 0.12^t}$$ and the posterior probability that $p=0.88$ is similarly $$\dfrac{(1-q) \,0.88^h \, 0.12^t}{q \,0.23^h \, 0.77^t +(1-q) \,0.88^h \, 0.12^t}$$ with these two expressions adding up to $1$. The posterior is again a pair of point masses at the same two values of $p$, only with updated weights. I would call this computationally easy.
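To make the "computationally easy" point concrete, here is a minimal Python sketch of the update above. The function names `posterior_first` and `update_sequentially` are made up for illustration, and the flip sequence is just one arbitrary ordering of 11 heads and 9 tails. It compares a single batch update with updating the weight one flip at a time, which is the kind of sequential estimation the quoted answer has in mind.

```python
def posterior_first(q, h, t, p1=0.23, p2=0.88):
    """Posterior probability that p == p1, given prior weight q on p1
    and 1 - q on p2, after observing h heads and t tails."""
    w1 = q * p1**h * (1 - p1)**t
    w2 = (1 - q) * p2**h * (1 - p2)**t
    return w1 / (w1 + w2)


def update_sequentially(q, flips, p1=0.23, p2=0.88):
    """Update the weight on p1 one flip at a time ('H' or 'T').
    Each step's posterior becomes the next step's prior."""
    for flip in flips:
        h, t = (1, 0) if flip == "H" else (0, 1)
        q = posterior_first(q, h, t, p1, p2)
    return q


# Hypothetical data: 11 heads and 9 tails, in an arbitrary order.
flips = "H" * 11 + "T" * 9
batch = posterior_first(0.5, 11, 9)
sequential = update_sequentially(0.5, flips)
print(round(batch, 4), round(sequential, 4))
```

The two numbers should agree up to floating-point rounding, since the order of the flips does not affect the product of likelihoods. For very long sequences the raw powers can underflow, so a practical version would probably work with log-weights instead.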

If you originally thought each possibility was equally likely, so that your prior had $q=\frac12$, then the factors $q$ and $(1-q)$ cancel, slightly simplifying the expression. For example, if you had

  • $q=\frac12$ and $h=11$ and $t=9$ you would get a posterior probability for $p=0.23$ of about $0.8776$ and a posterior probability for $p=0.88$ of about $0.1224$
  • $q=\frac12$ and $h=12$ and $t=8$ you would get a posterior probability for $p=0.23$ of about $0.2260$ and a posterior probability for $p=0.88$ of about $0.7740$

showing how sensitive the posterior is to the experimental results. Both these outcomes would be rather unlikely if you were right to be convinced at the outset that $p=0.23$ and $p=0.88$ were the only possibilities, since $20$ flips would then be expected to give about $4.6$ or $17.6$ heads.
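The two examples, and the remark that such outcomes would be unlikely, can be checked with a short Python sketch. The helper names `binom_pmf` and `summarize` are hypothetical, and the prior is assumed to be $q=\frac12$ as in the examples. For each outcome it computes the posterior weight on $p=0.23$ and the overall probability of seeing exactly that many heads in 20 flips under the two-point prior.

```python
from math import comb


def binom_pmf(h, n, p):
    """Probability of exactly h heads in n flips of a coin with P(heads) = p."""
    return comb(n, h) * p**h * (1 - p)**(n - h)


def summarize(h, n=20, p1=0.23, p2=0.88, q=0.5):
    """Return (posterior weight on p1, prior-predictive probability of h heads)."""
    like1 = binom_pmf(h, n, p1)
    like2 = binom_pmf(h, n, p2)
    marginal = q * like1 + (1 - q) * like2
    post1 = q * like1 / marginal  # the binomial coefficient cancels here
    return post1, marginal


for h in (11, 12):
    post1, marginal = summarize(h)
    print(f"h={h}: P(p=0.23 | data) = {post1:.4f}, P(data) = {marginal:.5f}")
```

The posterior values should match the roughly $0.8776$ and $0.2260$ quoted in the list, while the printed `P(data)` values come out well below $1\%$, which is the sense in which both outcomes are unlikely under this prior.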