Why is estimating the proportion of Democrats the same as estimating the bias of a coin?

My textbook is talking about how estimating the proportion of Democrats in the population reduces to estimating the bias of a coin, which I wasn't seeing. Here is the paragraph I was reading:

Consider the problem of estimating the proportion $p$ of Democrats in the US population, by taking a small random sample. We can model this as the problem of estimating the bias of a coin above, where each coin toss corresponds to a person that we select randomly from the entire population. And the coin tosses are independent: we are assuming here that the sampling is done “with replacement”; i.e., we select each person in the sample from the entire population, including those we have already picked. So there is a small chance that we will pick the same person twice.

I am not seeing how the problem of estimating the proportion of Democrats is equivalent to estimating the bias of a coin. There are 2 things I am confused about here:

  1. When we have a bunch of coin tosses from the same coin, each toss has an equal propensity to be heads, but people in a population don't have an equal propensity to be Democrat. Modelling this example as estimating the bias of a coin seems to assume that the probability of every person being a Democrat is the same. Are we treating the proportion of people who are Democrats as equivalent to the propensity of a particular person to be a Democrat?
  2. How is flipping a coin the same thing as sampling a person from the population? I see that we could classify things into 2 possibilities in both cases, with Democrat and non-Democrat corresponding to heads and tails. But sampling a person has as many possible outcomes as there are people in the population, which makes me feel that this is somehow different from flipping a coin, where there are only 2 outcomes.

I would be very grateful if anyone could explain how these situations are similar in a way which resolves my confusions.

There are 3 best solutions below

2
On BEST ANSWER

This is a really good question. Here is one way to think about it.

If you knew the actual proportion $p$ of Democrats in the population then you could design an unfair coin such that when you flipped it the probability of heads was $p$.

Now suppose you forget the value of $p$ that you used to build the coin. There are two ways you might try to recover $p$. One would be to sample the population and use the fraction of Democrats in the sample as an estimate of $p$. The larger the sample, the better the estimate.

Or you could flip the coin repeatedly and use the fraction of heads to estimate $p$. The more flips, the better the estimate.
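As a rough illustration of these two ways of recovering $p$, here is a minimal Python sketch. The population size (10,000), the 30% share of Democrats, and the function names are all made up for the example; none of them come from the textbook.

```python
import random

def estimate_by_polling(population, n, rng):
    """Pick n people with replacement; return the fraction who are Democrats."""
    draws = [rng.choice(population) for _ in range(n)]
    return sum(draws) / n

def estimate_by_flipping(p, n, rng):
    """Flip a coin with P(heads) = p a total of n times; return the fraction of heads."""
    flips = [rng.random() < p for _ in range(n)]
    return sum(flips) / n

rng = random.Random(0)

# Hypothetical population: 1 = Democrat, 0 = not a Democrat.
population = [1] * 3000 + [0] * 7000
p = sum(population) / len(population)  # true proportion, here 0.3

for n in (100, 10_000):
    poll = estimate_by_polling(population, n, rng)
    coin = estimate_by_flipping(p, n, rng)
    print(f"n={n}: poll estimate {poll:.3f}, coin estimate {coin:.3f}")
```

Both estimates should land near 0.3 and tighten as `n` grows, since each poll draw and each coin flip is a success with the same probability $p$.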

Perhaps this helps.

0
On

In each illustration (the biased coin and the political parties) we have a parameter we're attempting to estimate: the true bias of the coin and the true proportion of Democrats in the population, respectively. In each case, the parameter is represented by a value $\theta$ such that $0 \le \theta \le 1$.

In each illustration, we estimate the value of this parameter by sampling from the corresponding distribution and observing the results. In the case of the coin, we flip the coin $n$ times and observe the proportion of heads (or tails). Assuming the bias of the coin remains the same over all flips and the result of each flip is independent of all the others, we may construct a point estimate $\hat\theta$ of $\theta$ given by

$$ \hat\theta = \frac{\text{Number of heads (or tails)}}{n} $$

In the case of the political parties, we randomly choose one person from the population $n$ times and observe the proportion of Democrats (or non-Democrats). By choosing people uniformly at random and with replacement, we ensure that the probability of selecting a Democrat remains the same with each selection. Furthermore, we ensure that each selection is independent of all the others. Under such conditions, we construct a point estimate $\hat\theta$ of $\theta$ given by

$$ \hat\theta = \frac{\text{Number of Democrats (or non-Democrats)}}{n} $$
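To spell out why the two estimates behave identically, here is a short sketch in symbols. The notation is an assumption added for illustration: $N$ is the population size, $K$ the number of Democrats (so $\theta = K/N$), and $X_i$ equals 1 if the $i$-th selected person is a Democrat and 0 otherwise. Under uniform sampling with replacement, each $X_i$ is distributed exactly like one flip of a coin with bias $\theta$:

$$ P(X_i = 1) = \frac{K}{N} = \theta, \qquad \hat\theta = \frac{1}{n}\sum_{i=1}^{n} X_i, \qquad E[\hat\theta] = \theta, \qquad \operatorname{Var}(\hat\theta) = \frac{\theta(1-\theta)}{n} $$

The same expressions hold for the coin if $X_i$ instead indicates heads on the $i$-th flip, which is the sense in which the two problems coincide.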

By now you're hopefully able to see the one-to-one correspondence between the two illustrations. The bias of the coin is not analogous to each individual person's unique propensity to be a Democrat; rather, the bias of the coin is analogous to the true proportion of Democrats in the population, and this is precisely what we're trying to estimate through repeated sampling. I address some of your questions below:

> When we have a bunch of coin tosses from the same coin, each toss has an equal propensity to be heads; but people in a population don't have an equal propensity to be Democrat.

Again, you seem to think we're estimating each individual person's unique inner propensity to be a Democrat. We are not. Instead, we're estimating the proportion of Democrats in the population. We do this by randomly selecting a sample of people from the population and observing the proportion of Democrats in the sample. Each selection of a person from the population is akin to a coin toss, and the probability of selecting a Democrat remains the same on each selection, just as the probability of heads remains the same on each coin flip.

> Modelling this example as estimating the bias of a coin seems to assume that the probability of every person being a Democrat is the same. Are we treating the proportion of people who are Democrats as equivalent to the propensity of a particular person to be a Democrat?

The probability that a randomly selected person is a Democrat is determined by the true proportion of Democrats in the population. Similarly, the probability of each coin flip turning up heads is determined by the true bias of the coin.

> How is flipping a coin the same thing as sampling a person from the population?

The proportion of heads in a sample of $n$ flips is an estimate of the true bias of the coin. The proportion of Democrats in a sample of $n$ randomly selected people (with replacement) is an estimate of the true proportion of Democrats in the population.

0
On

> sampling a person has as many possible outcomes as there are people in the population, which makes me feel that this is somehow different from flipping a coin, where there are only 2 outcomes.

There are more than 2 possible outcomes of flipping a coin. It may spend different amounts of time in the air, or land in a different place. It's a choice to classify every outcome as "heads" or "tails", not a necessity. In the poll you likewise choose to classify everyone as "Democrat" or "not Democrat", ignoring their other differences.

> Modelling this example as estimating the bias of a coin seems to assume that the probability of every person being a Democrat is the same. Are we treating the proportion of people who are Democrats as equivalent to the propensity of a particular person to be a Democrat?

We are assuming that everyone either is or isn't a Democrat, which means the propensity of each person to be a Democrat is 100% or 0%. I think you're imagining that you first select a person, and then there is a coin toss to determine whether they are a Democrat or not. Really, the selection process itself is the coin toss and there is no more uncertainty after that.
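A tiny Python sketch may make this concrete. The five names and their party labels are invented for illustration. Every label is fixed in advance, so the only randomness is in who gets picked, and the many possible outcomes (one per person) collapse into just two:

```python
from fractions import Fraction

# Hypothetical population: each person's label is fixed (True = Democrat).
# No individual has a "propensity"; each is a Democrat with probability 1 or 0.
population = {"Ann": True, "Bob": False, "Cy": True, "Dee": False, "Eve": False}

def prob_democrat(population):
    """Probability that one uniformly random pick is a Democrat."""
    n = len(population)
    # Each person is picked with probability 1/n and contributes 1 or 0.
    return sum(Fraction(1, n) * int(is_dem) for is_dem in population.values())

print(prob_democrat(population))  # 2/5
```

There are five equally likely outcomes of the selection, but grouping them by label leaves "Democrat" with probability 2/5 and "not Democrat" with probability 3/5, which is exactly a coin with bias 2/5.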