biased coin once more: maximum likelihood estimation under constraints

Suppose we have two coins, A and B. Coin A is biased, with $P(\text{Head}\mid A)=0.8$ and $P(\text{Tail}\mid A)=0.2$, while coin B is unbiased, so $P(\text{Head}\mid B)=P(\text{Tail}\mid B)=0.5$. It is also known that the frequencies of A and B are 2:1, so we have two constraints: $$P(A)=2P(B)$$ $$P(A)+P(B)=1$$ Now let's say we have three observations, Head-Head-Head, and we try to find out which coin generated each observation.

Since $P(A)P(\text{Head}\mid A)$ is larger than $P(B)P(\text{Head}\mid B)$, maximum likelihood assigns all three observations to coin A. However, this violates the constraint that $P(A)=2P(B)$.

How can this contradiction be reconciled? Thanks!

Best answer

Some of this is somewhat unclear. I take it that by “frequencies” you mean the relative frequencies of these coins in some population of coins that the coins are being drawn from; and I gather that each of the three observations is made with a separate coin that is drawn from this population independently of the coins for the other two observations.

If so, $P(A)=2P(B)$ is not a constraint on the coins actually drawn. Perhaps you feel that the coins drawn should reflect the proportions of the types in the population; but this is already being taken into account in the calculation.
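The calculation referred to here can be sketched for a single observed head as follows. This is only an illustration using the numbers from the question; the variable names are my own, and exact fractions are used so that no rounding obscures the comparison.

```python
from fractions import Fraction

# Population proportions (2:1) and head probabilities, as given in the question.
prior = {"A": Fraction(2, 3), "B": Fraction(1, 3)}
p_head = {"A": Fraction(4, 5), "B": Fraction(1, 2)}

# Joint probability of drawing a coin of each type and then observing a head.
joint = {coin: prior[coin] * p_head[coin] for coin in prior}

# Bayes' rule: normalise by the total probability of observing a head.
p_heads_total = sum(joint.values())
posterior = {coin: joint[coin] / p_heads_total for coin in joint}

for coin in prior:
    print(coin, joint[coin], posterior[coin])
```

The population proportions enter through `prior`, so the comparison of $P(A)P(\text{Head}\mid A)$ with $P(B)P(\text{Head}\mid B)$ already accounts for the 2:1 ratio.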

To throw some more light on this, let’s take a more extreme example where $10$ observations all came out as heads.

Now you can ask different questions:

You can ask, for each observation separately, which type of coin most likely generated it. The answer is of course, as you state, $A$ in each case.

Or you can ask for all observations together which tuple of assignments of types to the coins makes the observations as a whole most likely. This factors into the individual questions above, so the answer is again that the most likely tuple of assignments is $(A,A,A,A,A,A,A,A,A,A)$.

Now you might say: “But isn’t that a very unlikely occurrence, that $10$ coins drawn were all of type $A$ when only $\frac23$ of the population are?” It is, but remember that here we’re comparing the likelihood of $(A,A,A,A,A,A,A,A,A,A)$ with the likelihood e.g. of $(B,A,A,A,A,A,A,A,A,A)$; and that tuple is even less likely to occur! (Namely by a factor of $2$ a priori, and by a factor of $\frac{16}5$ once the ten heads are taken into account.)
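The tuple comparison can be checked by brute force. The sketch below enumerates all $2^{10}$ assignments of types to the ten coins; the helper names `prior_probability` and `joint_probability` are mine and purely illustrative.

```python
from fractions import Fraction
from itertools import product

# Population proportions and head probabilities, as given in the question.
prior = {"A": Fraction(2, 3), "B": Fraction(1, 3)}
p_head = {"A": Fraction(4, 5), "B": Fraction(1, 2)}
n = 10  # number of observations, all heads


def prior_probability(assignment):
    """Probability of drawing exactly this sequence of coin types."""
    result = Fraction(1)
    for coin in assignment:
        result *= prior[coin]
    return result


def joint_probability(assignment):
    """Probability of drawing this sequence of types and seeing a head every time."""
    result = Fraction(1)
    for coin in assignment:
        result *= prior[coin] * p_head[coin]
    return result


all_a = ("A",) * n
one_b = ("B",) + ("A",) * (n - 1)

# Most probable assignment among all 2**n tuples.
best = max(product("AB", repeat=n), key=joint_probability)

print(best == all_a)
print(prior_probability(all_a) / prior_probability(one_b))
print(joint_probability(all_a) / joint_probability(one_b))
```

The first ratio compares how likely the two tuples are to be drawn at all; the second also includes the probability of the ten heads.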

But you could also ask which number of coins of each type is most likely to have been drawn. This question bunches all the tuples with one $B$ and $9$ $A$s into one category. The probabilities in this case are

$$ P(\text{$k$ times $A$ drawn})=\binom{10}k\left(\frac23\right)^k\left(\frac13\right)^{10-k} $$

and

$$ P(\text{$10$ heads}\mid\text{$k$ times $A$ drawn})=\left(\frac45\right)^k\left(\frac12\right)^{10-k}\;, $$

so the product is

\begin{eqnarray} P(\text{$10$ heads}\cap\text{$k$ times $A$ drawn}) &=& P(\text{$k$ times $A$ drawn})P(\text{$10$ heads}\mid\text{$k$ times $A$ drawn}) \\[10pt] &=& \binom{10}k\left(\frac8{15}\right)^k\left(\frac16\right)^{10-k}\;. \end{eqnarray}

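The ratio of the term for $k+1$ to the term for $k$ is $\frac{10-k}{k+1}\cdot\frac{16}5$, which exceeds $1$ exactly when $k\le7$, so the product has its maximum at $k=8$. So even though the most likely assignment of types is $(A,A,A,A,A,A,A,A,A,A)$, the most likely number of $A$s is $8$, not $10$.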
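For anyone who wants to verify the location of the maximum numerically, here is a small sketch that evaluates the product for every $k$. The function name `joint_k` is mine; `math.comb` requires Python 3.8 or later.

```python
from fractions import Fraction
from math import comb

n = 10  # number of observations, all heads


def joint_k(k):
    """P(10 heads and exactly k of the drawn coins are of type A)."""
    return comb(n, k) * Fraction(8, 15) ** k * Fraction(1, 6) ** (n - k)


values = {k: joint_k(k) for k in range(n + 1)}

# Number of A coins that maximises the joint probability.
best_k = max(values, key=values.get)

# Summing over k gives the total probability of observing 10 heads.
p_ten_heads = sum(values.values())

print(best_k)
for k, value in values.items():
    # Posterior probability of k coins of type A, given the 10 heads.
    print(k, float(value / p_ten_heads))
```

As a sanity check, the values sum to $\left(\frac8{15}+\frac16\right)^{10}=\left(\frac7{10}\right)^{10}$, the probability that a single randomly drawn coin shows heads, raised to the tenth power.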

Thus, to what extent the proportion of the coins in the population influences the answer depends on which question you ask.