A probability problem regarding elections with voters in two villages


The problem

We have Village A and Village B, each with an unspecified population. It is known that in every election 49% of the villagers in Village A always vote for Democrats and the remaining 51% always vote for Republicans.

Similarly, in Village B 51% always vote for Democrats and 49% for Republicans.

We randomly select 100 people from each village.

What is the probability that there are more "Democrats" in the 100 selected from Village A when compared to those selected from Village B?


My Attempt

I tried to count the number of favorable cases divided by total number of cases.

The total number of cases would be $${n \choose 100} \times {m \choose 100}$$ assuming the populations of Village A and B are $n$ and $m$ respectively. I hoped that the expression would simplify because of the $n$ and $m$ in the numerator of the probability.

The total number of favorable cases ended up becoming a long series. Here I assume both $n$ and $m$ are larger than $100$.

$$\sum_{i=1}^{100} \left [ {0.49n \choose i} \times {0.51n \choose 100-i} \sum_{j=0}^{i-1} {0.51m \choose j} \times {0.49m \choose 100-j} \right ]$$

This means that whenever I choose $i$ Democrats from Village A, I count all possible ways of choosing fewer than $i$ Democrats (namely $j$ of them) from Village B.

So I think the answer is,

$$P = \frac{\sum_{i=1}^{100} \left [ {0.49n \choose i} \times {0.51n \choose 100-i} \sum_{j=0}^{i-1} {0.51m \choose j} \times {0.49m \choose 100-j} \right ]}{{n \choose 100} \times {m \choose 100}}$$


This question was asked in an MCQ-style online test where my friend had less than 2 minutes to solve it, so I am sure there is an elegant solution. I want to know what the probability is and, if possible, how it changes for other percentage values (instead of $49$ and $51$) and for arbitrary sample sizes, say $x$ people from Village A and $y$ people from Village B.

Thanks.


There are 2 best solutions below

Best answer

I think your solution is correct.

The following is not quite a 2 min solution... If both villages' populations are large, you can do the following. Let $X$ and $Y$ be the numbers of Democrats selected from Villages A and B. These random variables are then approximately binomial with $100$ trials and success probabilities $.49$ and $.51$ respectively. These distributions can in turn be approximated by normal distributions with means $100\times.49=49$ and $100\times.51=51$ and identical variances: $$100\times.49\times(1-.49)=100\times.51\times(1-.51)=24.99.$$

Since the two samples are independent, $X-Y$ is then approximately normal with mean $49-51=-2$ and variance $24.99+24.99=49.98$. You are looking for $$ P(X>Y)=P(X-Y>0)\approx .39 $$ using the usual normal distribution computations or tables.
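As a rough sketch, the normal-approximation step can be reproduced in a few lines of Python. The function name `prob_more_normal` and its parameters are illustrative assumptions, not part of the answer. The optional continuity correction is also an addition: it is a standard refinement for integer-valued variables, not something the answer above uses.

```python
from math import sqrt
from statistics import NormalDist


def prob_more_normal(p_a, n_a, p_b, n_b, continuity=False):
    """Normal approximation to P(X > Y) for independent
    X ~ Binomial(n_a, p_a) and Y ~ Binomial(n_b, p_b)."""
    mean = n_a * p_a - n_b * p_b
    var = n_a * p_a * (1 - p_a) + n_b * p_b * (1 - p_b)
    # X - Y is integer-valued, so X > Y means X - Y >= 1.
    # The continuity correction (an assumption added here) moves the
    # cut-off from 0 to the midpoint 0.5.
    cutoff = 0.5 if continuity else 0.0
    return 1.0 - NormalDist(mean, sqrt(var)).cdf(cutoff)


plain = prob_more_normal(0.49, 100, 0.51, 100)
corrected = prob_more_normal(0.49, 100, 0.51, 100, continuity=True)
print(f"plain: {plain:.4f}, with continuity correction: {corrected:.4f}")
```

The plain version should land near $.39$. Other proportions or sample sizes (say $x$ people from Village A and $y$ from Village B) only change the four arguments.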

Note that without the normal approximation you can use the binomial distribution but the computations are much longer: $$ P(X>Y)= \sum_{x=0}^{100}\left(P(X=x)\times\sum_{y=0}^{x-1}P(Y=y)\right) $$ where $P(X=x)$ and $P(Y=y)$ are the usual binomial functions. Mathematica tells me the result is $.3617$ (the normal approximation is not too bad here).

Note that your computations yield probabilities of $0.3610$ when both villages have populations of 10,000 and $0.3546$ with 1,000.
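The finite-population figures can be checked against the formula from the question using exact integer arithmetic. This is a sketch under stated assumptions: the function name `prob_more_finite` is hypothetical, and the village sizes must make $0.49n$ and $0.51m$ whole numbers. The loop starts at $i=0$ rather than $i=1$, which changes nothing because the $i=0$ term is zero.

```python
from fractions import Fraction
from math import comb


def prob_more_finite(n, m, k=100, dem_a=Fraction(49, 100), dem_b=Fraction(51, 100)):
    """Exact P(more Democrats in the sample from A than from B) when k people
    are drawn without replacement from villages of sizes n and m."""
    a_dem, b_dem = n * dem_a, m * dem_b
    if a_dem.denominator != 1 or b_dem.denominator != 1:
        raise ValueError("village sizes must give whole numbers of Democrats")
    a_dem, b_dem = int(a_dem), int(b_dem)
    a_rep, b_rep = n - a_dem, m - b_dem

    favourable = 0
    ways_b_below = 0  # ways to draw k from B with fewer than i Democrats
    for i in range(k + 1):
        # math.comb returns 0 when asked for more items than are available
        favourable += comb(a_dem, i) * comb(a_rep, k - i) * ways_b_below
        ways_b_below += comb(b_dem, i) * comb(b_rep, k - i)
    return Fraction(favourable, comb(n, k) * comb(m, k))


for size in (100, 1_000, 10_000):
    print(size, float(prob_more_finite(size, size)))
```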

Given that if both villages have a population of 100 the probability is $0$, I tend to guess that for populations larger than 100 the probability will be between 0 and $.3617$.

Btw, the 2 min solution is probably... "less than 50%" !

Changing the proportions of voters in each village or the number of voters selected is quite straightforward.

To get a sense of the impact of the changes plots are useful. Here I use the normal approximation (large villages) just because it is computationally easier:

Mathematica graphics

Getting a sense of the impact of village sizes may also be of interest. Here I use your (exact) formula, with both villages being of the same (variable) size. The horizontal line corresponds to the above value of $.36$.

Mathematica graphics

Second answer

I guess symmetry in this question helps for an easier solution. Let d(k) and r(k) denote the numbers of Democrats and Republicans respectively in a random sample of 100 from village k. Say

X = Pr (d(A) > d(B)) = Pr (r(A) < r(B))
Y = Pr (d(A) < d(B)) = Pr (r(A) > r(B))
Z = Pr (d(A) = d(B))

Since the probability of a Democrat in village A is equal to the probability of a Republican in village B, and the probability of a Republican in village A is equal to the probability of a Democrat in village B, we can say by symmetry that X = Y.

X+Y+Z = 1
2X+Z = 1

Now it just suffices to find Z.

$$ Z = \sum_{i=0}^{100} {100 \choose i} (0.49)^i (0.51)^{100-i} \times {100 \choose i} (0.51)^i (0.49)^{100-i} $$

$$ Z = (0.49)^{100} (0.51)^{100} \sum_{i=0}^{100} {100 \choose i} {100 \choose i} $$

Using the fact that $ \sum_{i=0}^{n} {n \choose i} {n \choose i} = {2n \choose n} $

$$ Z = {200 \choose 100} (0.49)^{100} (0.51)^{100} $$

$$ X = 0.5 \left(1- {200 \choose 100} (0.49)^{100} (0.51)^{100}\right) $$
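The formulas above can be checked numerically. The following Python sketch (all variable names are illustrative) assumes the samples behave as independent binomials, verifies the identity $\sum_i {n \choose i}^2 = {2n \choose n}$, and compares the closed form for Z with a direct sum. It also computes X and Y directly. In this check the closed form for Z holds, but the direct sums for X and Y come out unequal (roughly $.36$ and $.58$), so $0.5(1-Z) \approx .47$ does not match Pr(d(A) > d(B)). The reason appears to be that swapping "Democrat in A" with "Republican in B" maps the event d(A) > d(B) onto r(B) > r(A), which is the same event, so the symmetry gives X = X rather than X = Y.

```python
from math import comb

n, p, q = 100, 0.49, 0.51

# Vandermonde-type identity used for Z (exact integer check).
assert sum(comb(n, i) ** 2 for i in range(n + 1)) == comb(2 * n, n)

# Binomial probabilities for the number of Democrats in each sample.
pmf_a = [comb(n, k) * p**k * q ** (n - k) for k in range(n + 1)]  # village A
pmf_b = [comb(n, k) * q**k * p ** (n - k) for k in range(n + 1)]  # village B

z_closed = comb(2 * n, n) * (p * q) ** n
z_direct = sum(a * b for a, b in zip(pmf_a, pmf_b))

# Direct evaluation of X = Pr(d(A) > d(B)) and Y = Pr(d(A) < d(B)).
x_direct = sum(pmf_a[i] * pmf_b[j] for i in range(n + 1) for j in range(i))
y_direct = sum(pmf_a[i] * pmf_b[j] for i in range(n + 1) for j in range(i + 1, n + 1))

print(f"Z closed form : {z_closed:.4f}")
print(f"Z direct sum  : {z_direct:.4f}")
print(f"X direct sum  : {x_direct:.4f}")
print(f"Y direct sum  : {y_direct:.4f}")
print(f"0.5 * (1 - Z) : {0.5 * (1 - z_closed):.4f}")
```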