Coin toss with unknown probability – Bayesian interpretation


I have observed a coin being tossed $n$ times. I do not know whether the coin is fair or not, but in every single toss I observed, the coin came up heads.

What should my belief about $p$ (the probability that the coin shows heads) be now? I cannot even say with certainty that $p>0$, since even an event with $p=0$ can occur. The frequency of heads is most compatible with $p=1$, but I doubt that is the best guess, especially if $n$ is low (it would be ridiculous to assume that $p=1$ after seeing a single heads only).

How can this be handled in a Bayesian framework? What is my best guess for the true value of $p$?



BEST ANSWER

This depends on the a priori assumption about $p$. If it is uniformly distributed a priori (i.e. $P(p<a)= a$ for $0\le a\le1$), then the probability of seeing $n$ heads in a row is $$\int_0^1 p^n \,\mathrm dp=\frac1{n+1}.$$ The probability of $n$ heads and $p<a$ is $$\int_0^a p^n \,\mathrm dp=\frac1{n+1}a^{n+1}.$$ Then the probability of $p<a$ given that we observe $n$ heads is $$ P(p<a\mid n\text{ heads})=\frac{\frac1{n+1}a^{n+1}}{\frac1{n+1}}=a^{n+1}.$$ In other words: with every head we observe, the cdf of $p$ is raised to a higher power and thus shifts further and further towards $1$. We see that there is a 50% chance that $p>\frac1{\sqrt[n+1]2}$, and the most likely value of $p$ is indeed $1$, even after a single head! If you find that counterintuitive, it is because in the back of your head you do not start with all values of $p$ equally likely, but rather with a huge bias towards "more or less" fair coins.
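The quantities derived above are easy to compute directly. A minimal sketch (the helper names here are made up for illustration), using only the posterior cdf $a^{n+1}$ from the answer, its median, and the posterior mean $(n+1)/(n+2)$, which is Laplace's rule of succession:

```python
def posterior_cdf(a, n):
    """P(p < a | n heads) under a uniform prior: a^(n+1)."""
    return a ** (n + 1)

def posterior_median(n):
    """The a solving a^(n+1) = 1/2, i.e. 2^(-1/(n+1))."""
    return 0.5 ** (1 / (n + 1))

def posterior_mean(n):
    """E[p | n heads] = (n+1)/(n+2), Laplace's rule of succession."""
    return (n + 1) / (n + 2)

for n in (1, 10, 100):
    print(n, round(posterior_median(n), 4), round(posterior_mean(n), 4))
```

Note that while the posterior mode is $1$, the posterior mean $(n+1)/(n+2)$ never reaches $1$ for finite $n$; it is a natural candidate for the "best guess" the question asks for.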

ANSWER

Bayesianism is adherence to a degree-of-belief interpretation of probability rather than to a frequency interpretation.

Hagen von Eitzen's answer is correct if the prior degree of belief about $p$ is expressed by a uniform distribution.

The physicist Edwin Jaynes once argued in a paper that if one has never suspected either outcome of existing until one of them is observed, then that epistemic situation should be modeled by using $$ \frac{dp}{p(1-p)} \tag 1 $$ as the prior distribution. That is NOT a probability distribution since it assigns infinite measure to the whole space. If you observed heads ten times, the posterior would then be $$ \frac{p^9\,dp}{1-p}, $$ which is still not a probability distribution. At this point one is in the epistemic state of never having even suspected that the black swan --- the tails outcome --- is a possibility. But if one has tried twice and observed heads once and tails once, then one knows that both possible outcomes exist, and application of Bayes' formula to the prior $(1)$ yields the uniform distribution as the posterior.
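A small sanity check of this (a sketch; the helper name is hypothetical): multiplying the prior density $1/(p(1-p))$ by the likelihood $p^h(1-p)^t$ gives the unnormalised posterior $p^{h-1}(1-p)^{t-1}$, which is constant, i.e. uniform, exactly when $h=t=1$:

```python
def haldane_posterior(p, heads, tails):
    """Unnormalised posterior: prior 1/(p(1-p)) times likelihood p^h (1-p)^t."""
    return p ** (heads - 1) * (1 - p) ** (tails - 1)

grid = [i / 100 for i in range(1, 100)]

# One head and one tail: the posterior is flat on (0, 1), i.e. uniform.
flat = [haldane_posterior(p, 1, 1) for p in grid]

# Ten heads and no tails: still improper, with mass piling up near p = 1.
ten_heads = [haldane_posterior(p, 10, 0) for p in grid]
```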

If your epistemic state is like that --- knowing ONLY that those two outcomes are possible --- then Jaynes' argument would lead to the conclusion that the uniform distribution is the right prior.

Historically, in Thomas Bayes' famous posthumous paper, which appeared in 1763, two years after his death, this kind of experiment, with its uniform prior and resulting Beta posterior, was the only problem considered. It was in that paper that Bayes derived the result that $$ \int_0^1 \binom n k x^k(1-x)^{n-k}\,dx = \frac{1}{n+1} $$ by the method that I described here.
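Bayes' integral can be verified numerically; a rough midpoint-rule sketch (function names are made up for illustration):

```python
from math import comb

def bayes_integrand(x, n, k):
    """binom(n, k) * x^k * (1 - x)^(n - k), the binomial likelihood."""
    return comb(n, k) * x ** k * (1 - x) ** (n - k)

def bayes_integral(n, k, steps=100_000):
    """Midpoint rule on [0, 1]; the result is 1/(n+1) for every k."""
    h = 1.0 / steps
    return h * sum(bayes_integrand((i + 0.5) * h, n, k) for i in range(steps))
```

For instance, `bayes_integral(5, 2)` comes out close to $1/6$, and the value is the same (up to numerical error) for any $k$ between $0$ and $n$.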