In a survey to estimate the proportion "p" of votes that a party will poll in an election, the voter list is divided into male and female lists.
A sample of 100 from each list by simple random sampling without replacement (SRSWOR) is taken.
It is observed that $x_1$ and $x_2$ voters from the respective samples will vote for the party.
The estimate of p is taken as $p=\dfrac{x_1+x_2}{200}$.
Is this a biased estimate? Answer given: Yes. An undivided voter list would give an unbiased estimate.
I did not understand why this approach would lead to a biased estimate.
I know the statistical definition of unbiasedness (the mean of the sampling distribution of a statistic equals the corresponding population parameter).
Any hints?
Imagine for a moment that there are 10,000 male voters and 100,000 female voters. Then each male in the sample represents 100 actual voters and each female in the sample represents 1,000 actual voters. So the estimate for the total number of people who vote for this party ought to be $100x_1 + 1{,}000x_2$, and the estimate for the proportion ought to be $$ \tilde p = \frac{100x_1 + 1{,}000x_2}{110{,}000} = \frac{x_1 + 10x_2}{1{,}100} $$
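which is unbiased. Using an undivided (unblocked) sample of 200 also gives an unbiased estimate, and with lists this unequal in size it will usually have lower variance, because equal samples of 100 over-sample the small list.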
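To see where the bias comes from, here is a short check of the expectations. The symbols $p_1$ and $p_2$ are my own notation (not in the question) for the true proportions of party supporters among male and female voters. Under SRSWOR each sample proportion is unbiased for its own list, so $E[x_1] = 100p_1$ and $E[x_2] = 100p_2$. With the numbers imagined above, $$ E[\tilde p] = \frac{100 \cdot 100p_1 + 1{,}000 \cdot 100p_2}{110{,}000} = \frac{10{,}000p_1 + 100{,}000p_2}{110{,}000} = p, $$ since $10{,}000p_1$ and $100{,}000p_2$ are the actual numbers of male and female supporters. The unweighted estimate instead has $$ E\left[\frac{x_1 + x_2}{200}\right] = \frac{p_1 + p_2}{2}, $$ which gives the two lists equal weight regardless of their sizes.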
The estimate $\frac{x_1 + x_2}{200}$ is unbiased in general only if the number of male and female voters is the same (or if the party's support happens to be equal in the two groups).
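If you want to convince yourself numerically, a small simulation can help. The sketch below is only an illustration: the function name `simulate`, the support levels of 30% among men and 60% among women, and the number of replications are all assumptions of mine, not part of the problem. It draws SRSWOR samples of 100 from each list and compares the average of the unweighted estimate, the weighted estimate, and an undivided sample of 200 against the true proportion.

```python
import random


def simulate(n_male=10_000, n_female=100_000, p_male=0.3, p_female=0.6,
             n_per_list=100, reps=2_000, seed=0):
    """Average three estimators of p over repeated SRSWOR samples.

    All default values are hypothetical and chosen only for illustration.
    """
    rng = random.Random(seed)

    # Build each voter list: 1 = will vote for the party, 0 = will not.
    male_yes = round(n_male * p_male)
    female_yes = round(n_female * p_female)
    males = [1] * male_yes + [0] * (n_male - male_yes)
    females = [1] * female_yes + [0] * (n_female - female_yes)
    everyone = males + females
    true_p = sum(everyone) / len(everyone)

    # Number of actual voters represented by each sampled voter.
    w_male = n_male / n_per_list
    w_female = n_female / n_per_list

    naive = weighted = undivided = 0.0
    for _ in range(reps):
        # random.Random.sample draws without replacement, i.e. SRSWOR.
        x1 = sum(rng.sample(males, n_per_list))
        x2 = sum(rng.sample(females, n_per_list))
        naive += (x1 + x2) / (2 * n_per_list)
        weighted += (w_male * x1 + w_female * x2) / (n_male + n_female)
        # One sample of 200 from the undivided list.
        undivided += sum(rng.sample(everyone, 2 * n_per_list)) / (2 * n_per_list)

    return {
        "true_p": true_p,
        "naive": naive / reps,
        "weighted": weighted / reps,
        "undivided": undivided / reps,
    }


result = simulate()
for name, value in result.items():
    print(f"{name:>9}: {value:.4f}")
```

With these assumed numbers the true proportion is $63{,}000/110{,}000 \approx 0.573$, while the unweighted estimate should average out near $(0.3 + 0.6)/2 = 0.45$; the weighted and undivided estimates should both average close to the true value.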