In yesterday's Iowa Caucus, Hillary Clinton beat Bernie Sanders in six out of six tied counties by a coin-toss*. I believe we would have heard the uproar about it by now if this was somehow rigged in her favor, but I wanted to calculate the odds of this happening, assuming she really was that lucky, and assuming she rigged various numbers of the tosses.
* As many people have pointed out already, this turned out to be a selective data set - Sanders won just about as many coin tosses as Mrs. Clinton did. Read on if you still care about the problem.
At first I calculated the odds using simple rules for the probabilities of independent events: $$ P(6\text{H})=P(\text{H})^{6}=\left( \frac{1}{2} \right)^{6}= \frac{1}{64} \approx 1.56\% $$ i.e. naively, there was a 1.56% chance it was fair.
But I vaguely remembered from reading about Bayesian inference that we can make a more educated statement about whether or not this was fair using Bayes' Theorem, and assuming various numbers of the coin tosses were rigged.
I tried it out myself, and here's what I came up with, but I'm fairly positive I worked this out incorrectly, so here's hoping you wonderful people can help. Here's my shot at it:
Example assuming it was fair (0% chance it was rigged): $$ P(6\text{H}) = \underbrace{P(6\text{H}|\text{fair})}_{1/64}\underbrace{P(\text{fair})}_{1} + \underbrace{P(6\text{H}|\text{not fair})}_{1}\underbrace{P(\text{not fair})}_{0} = \frac{1}{64} $$ and by Bayes' Theorem: $$ P(\text{fair}|6\text{H}) = \frac{P(6\text{H}|\text{fair})P(\text{fair})}{P(6\text{H})} = \frac{(1/64)(1)}{(1/64)}=1 $$ (obviously).
Assuming $n$ of the tosses were rigged: $$ P(6\text{H}) = \underbrace{P(6\text{H}|\text{fair})}_{\left(\frac{1}{2}\right)^{6}}\underbrace{P(\text{fair})}_{1-\frac{n}{6}} + \underbrace{P(6\text{H}|\text{not fair})}_{1}\underbrace{P(\text{not fair})}_{\frac{n}{6}} = \frac{6-n}{384} + \frac{n}{6} = \frac{63n+6}{384} $$ and by Bayes' Theorem: $$ P(\text{fair}|6\text{H}) = \frac{P(6\text{H}|\text{fair})P(\text{fair})}{P(6\text{H})} = \frac{\left(\frac{1}{64}\right)\left(\frac{6-n}{6}\right)}{\left(\frac{63n+6}{384}\right)}=\frac{6-n}{63n+6} $$
Here's a plot of the probabilities that the coin tosses were fair given an assumption of $n$ unfair coins:
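The values behind such a plot can be tabulated directly from the expression above. This is only a minimal sketch that reproduces the question's own setup (prior $n/6$ for "not fair", likelihood $1$ for six heads when not fair); the function name `fair_given_six_heads` is made up for illustration, and the sketch does not claim that this setup is the right model.

```python
from fractions import Fraction


def fair_given_six_heads(n: int) -> Fraction:
    """Evaluate P(fair | 6H) under the question's assumptions for n rigged tosses."""
    prior_fair = Fraction(6 - n, 6)        # assumed prior from the question
    prior_not_fair = Fraction(n, 6)        # assumed prior from the question
    likelihood_fair = Fraction(1, 64)      # six fair coins all landing heads
    likelihood_not_fair = Fraction(1)      # question assumes certainty if not fair
    evidence = likelihood_fair * prior_fair + likelihood_not_fair * prior_not_fair
    return likelihood_fair * prior_fair / evidence


for n in range(7):
    value = fair_given_six_heads(n)
    # Exact fraction, then a decimal approximation for comparison with 1/64.
    print(n, value, float(value))
```

Using exact fractions makes it easy to compare each value against $1/64$ without rounding issues.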
Questions:
- I'm pretty sure some of my assumptions for probabilities were off in various parts of this - if so, where did I go wrong?
- On the off chance I carried this out correctly, what can be made of these results? For example, is it most probable that there were 0, 1, or 2 coin tosses that were unfair, as making the assumption that there were $n<3$ unfair coins gives a probability $P(\text{fair}|6\text{H})$ greater than the $1/64$ chance it was fair?
EDIT:
@Eric Wofsey informed me that I was calculating the wrong probability. What I really wanted to calculate was $P(0|6H)$, the probability of 0 coins being rigged, given an outcome of 6 heads. What I learned (I'm new to Bayesian inference) is that it all depends upon your prior guess as to the probability that $n$ of the coins were rigged. As he pointed out: $$ P(0|6H) = \frac{P(0)}{\sum_{i=0}^{6}2^iP(i)} $$ where $$ P(n) = {6 \choose n}p^n(1-p)^{6-n} $$ and $p$ is the prior probability that each coin toss was rigged. Here's what $P(0|6H)$ looks like fully expanded (assuming the prior $p$ is the same for each $P(n)$):
As I learned, the prior probability is arbitrarily chosen, and represents your belief/guess as to the likelihood that the coins were rigged.
I was interested in looking at what the distribution of $P(0|6H)$ looked like for values of $p$ from 0 to 1 (0 meaning you believe there's no possibility the coins were rigged, 1 meaning you're certain the coins were rigged). Here's the plot:
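For anyone who wants to reproduce the curve numerically, here is a rough sketch of $P(0|6H)$ under the binomial prior. The function names are made up for illustration. It also checks the sum against a closed form: by the binomial theorem, $\sum_{i=0}^{6} 2^i \binom{6}{i} p^i (1-p)^{6-i} = (1+p)^6$, so the expression should collapse to $\left(\frac{1-p}{1+p}\right)^6$.

```python
from math import comb


def posterior_no_rigging(p: float, tosses: int = 6) -> float:
    """P(0 rigged | all heads) with a Binomial(tosses, p) prior on the rigged count."""
    prior = [comb(tosses, n) * p**n * (1 - p) ** (tosses - n) for n in range(tosses + 1)]
    # Denominator from the formula above: sum of 2^i * P(i).
    weighted = sum(2**n * prior[n] for n in range(tosses + 1))
    return prior[0] / weighted


def closed_form(p: float, tosses: int = 6) -> float:
    """Same quantity after simplifying the denominator to (1 + p)^tosses."""
    return ((1 - p) / (1 + p)) ** tosses


for p in (0.0, 0.01, 0.1, 0.25, 0.5, 1.0):
    # The two columns should agree up to floating-point error.
    print(p, posterior_no_rigging(p), closed_form(p))
```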
I may be going way off the reservation here, but if this graph represents values of $P(0|6H)$ for prior probabilities of having rigged coins, wouldn't the integral of this from $p=0$ to $1$ represent the total probability of 0 rigged coins, considering an outcome of 6 heads, with each prior $p$ weighted equally? Whether or not I'm abusing the maths, the integral evaluates to: $$ \int_{0}^{1} P(0|6H)(p) \ dp = 0.0822\dots $$
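The quoted value can be checked numerically. The sketch below uses a hand-rolled composite Simpson's rule (my choice of method, not necessarily how the number above was obtained) on the binomial-prior expression for $P(0|6H)$; the helper names are hypothetical.

```python
from math import comb


def posterior_no_rigging(p: float, tosses: int = 6) -> float:
    """P(0 rigged | all heads) with a Binomial(tosses, p) prior on the rigged count."""
    prior = [comb(tosses, n) * p**n * (1 - p) ** (tosses - n) for n in range(tosses + 1)]
    return prior[0] / sum(2**n * prior[n] for n in range(tosses + 1))


def simpson(f, a: float, b: float, intervals: int = 1000) -> float:
    """Composite Simpson's rule; `intervals` must be even."""
    if intervals % 2:
        raise ValueError("intervals must be even")
    h = (b - a) / intervals
    total = f(a) + f(b)
    for k in range(1, intervals):
        # Odd interior nodes get weight 4, even interior nodes weight 2.
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3


average = simpson(posterior_no_rigging, 0.0, 1.0)
print(round(average, 4))  # 0.0822
```

This agrees with the $0.0822\dots$ above; whether averaging over a uniform $p$ is a meaningful thing to do is a separate modelling question.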
Note: I'm thinking in retrospect that the prior $p$ should probably be different for every $P(n)$ and assuming they're the same for each $P(n)$ is likely problematic, but I thought I'd share my process anyway.
EDIT 2:
On further thought, it seems like what I really want to compute is the integral: $$\int{\int{\int{\int{\int{\int{\int P(0|6H) \ d p_0 \ d p_1 \ d p_2 \ d p_3 \ d p_4 \ d p_5 \ d p_6}}}}}}$$ where $$ P(0|6H) = \dfrac{(1-p_0)^6}{(1-p_0)^6 + 12p_1(1-p_1)^5 + 60p_2^2(1-p_2)^4 + 160p_3^3(1-p_3)^3 + 240p_4^4(1-p_4)^2 + 192p_5^5(1-p_5) + 64p_6^6} $$ and $$ p_0 + p_1 + p_2 + p_3 + p_4 + p_5 + p_6 = 1 $$ and $p_n$ is the prior probability that $n$ coins are rigged.
I have absolutely no idea how one would even begin to evaluate this integral. Its value seems to depend on the choices of $p_n$ anyway, so there is a range of possible values. It does seem doable for specific choices of $p_n$, or perhaps for a normalized distribution over the $p_n$ that depends on $n$, such as a decaying distribution in which larger numbers of rigged coins are less likely.
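To make the "decaying distribution" idea concrete, here is a small sketch that assumes a geometric prior $P(n) \propto r^n$ on the number of rigged coins and plugs it into $P(0|6H) = P(0)/\sum_i 2^i P(i)$. The geometric shape and the ratio `r` are purely illustrative assumptions, not something the problem dictates, and the function names are made up.

```python
from fractions import Fraction


def geometric_prior(r: Fraction, tosses: int = 6) -> list:
    """Normalized prior with P(n) proportional to r**n for n = 0..tosses (an assumption)."""
    weights = [r**n for n in range(tosses + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def posterior_fair(prior: list) -> Fraction:
    """P(0 rigged | all heads) = P(0) / sum_i 2^i P(i)."""
    weighted = sum(2**n * prior[n] for n in range(len(prior)))
    return prior[0] / weighted


for r in (Fraction(1, 4), Fraction(1, 2), Fraction(1)):
    # r = 1 is the uniform prior over 0..6 rigged coins.
    print(r, posterior_fair(geometric_prior(r)))
```

With this particular prior the result simplifies to $1/\sum_{n=0}^{6}(2r)^n$, so $r = 1/2$ gives exactly $1/7$.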
Happy Tuesday



Your computation doesn't make any sense. Assuming that $n$ of the tosses were rigged doesn't mean that you're assigning a prior probability of $n/6$ to "not fair". If you're saying the only possibilities are "fair" and "not fair" and "not fair" means a 100% chance of Clinton winning all $6$ tosses (which is what your computation of "$P(6H)$" implies), then that means either all the coins are rigged or none of them are, with $P(\text{not fair})$ being the prior probability that they are all rigged.
The computation that you seem to be trying to do when you're computing "$P(6H)$" is not $P(6H)$ but $P(6H|n\text{ rigged coins})$, i.e. the probability of getting $6$ heads assuming exactly $n$ of the coins were rigged. This is very easy to compute: it's just the probability of the $6-n$ non-rigged coins coming up heads, which is $1/2^{6-n}$. Note that if you allow this possibility, you are no longer saying the only options are "fair" and "not fair"; rather the options are "$n$ rigged coins" for each $n$ between $0$ and $6$ (with $n=0$ being "fair" and $n=6$ being what you called "not fair"). You then get that $$P(6H)=P(6H|0)P(0)+P(6H|1)P(1)+\dots+P(6H|6)P(6)=\frac{P(0)}{2^6}+\frac{P(1)}{2^5}+\dots+ P(6),$$ where I am abbreviating the event "$n$ rigged coins" as simply "$n$". You then get that $$P(0|6H)=\frac{P(6H|0)P(0)}{P(6H)}=\frac{P(0)}{P(0)+2P(1)+\dots+2^6P(6)}$$ is the probability that all the coins were fair, given that they all came up heads. Note here that $P(0),P(1),\dots,P(6)$ are prior probabilities: the probability (before you knew the outcome of the coin tosses) with which you believed that $n$ of the coin tosses were rigged. You don't get a value for $P(0|6H)$ until you plug in values for these priors. If, for instance, you believe that each coin toss independently had a prior probability of $p$ of being rigged, then $P(n)=\binom{6}{n}p^n(1-p)^{6-n}$. However, this is probably not a reasonable assumption (you wouldn't expect the riggedness of each toss to be independent--if one of them is being rigged, then that makes it more likely there is a conspiracy which means more of them will be rigged).
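As a rough illustration of how much the choice of prior matters, the sketch below computes the full posterior over the number of rigged coins from any prior vector $P(0),\dots,P(6)$, using the likelihoods $P(6H|n)=1/2^{6-n}$. It then compares the independent (binomial) prior with a hypothetical "all or nothing" conspiracy prior. The function names and the example value $0.01$ are assumptions made for illustration only.

```python
from math import comb

TOSSES = 6


def posterior_rigged_counts(prior: list) -> list:
    """Posterior P(n rigged | all heads) for n = 0..TOSSES, given prior P(n)."""
    if len(prior) != TOSSES + 1:
        raise ValueError("prior must have one entry per possible rigged count")
    # P(6H | n): the 6 - n fair coins must all come up heads.
    likelihood = [0.5 ** (TOSSES - n) for n in range(TOSSES + 1)]
    joint = [lik * pr for lik, pr in zip(likelihood, prior)]
    evidence = sum(joint)  # this is P(6H)
    return [j / evidence for j in joint]


def binomial_prior(p: float) -> list:
    """Each toss independently rigged with probability p."""
    return [comb(TOSSES, n) * p**n * (1 - p) ** (TOSSES - n) for n in range(TOSSES + 1)]


def all_or_nothing_prior(q: float) -> list:
    """Hypothetical conspiracy prior: all tosses rigged with probability q, else none."""
    return [1 - q] + [0.0] * (TOSSES - 1) + [q]


independent = posterior_rigged_counts(binomial_prior(0.01))
conspiracy = posterior_rigged_counts(all_or_nothing_prior(0.01))
print("P(0 | 6H), independent prior:   ", independent[0])
print("P(0 | 6H), all-or-nothing prior:", conspiracy[0])
```

The first entry of each returned list is $P(0|6H)$, and it matches the formula $P(0)/\big(P(0)+2P(1)+\dots+2^6P(6)\big)$.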