Why is $\bar x(1-\bar x) + \frac{s^2}{n}$ an unbiased estimator of $\mu (1-\mu)$?


Consider a population of boolean values in $\{0,1\}$. In the population, the mean (that is, the frequency of 1) is $\mu$. We take a sample of size $n$, whose mean $\bar x$ is

$$\bar x = \frac{\sum_{i=1}^n x_i}{n}$$

and sample variance

$$s^2 = \frac{\sum_{i=1}^n (x_i - \bar x)^2}{n-1}$$

I would like to estimate the parameter $D=\mu (1-\mu)$. The small simulation below (coded in R) suggests that an unbiased estimator of $D$ is

$$\hat D = \bar x(1-\bar x) + \frac{s^2}{n}$$

Can you help me figure out why this is true?


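nbtrials = 5000     # number of simulated samples
popSize = 200       # not used below
pop = 0:1           # population of boolean values, so mu = 0.5
sampleSize = 10     # n

out = numeric(nbtrials)
for (trial in 1:nbtrials)
{
    # draw n values with replacement, i.e. i.i.d. draws from the population
    s = sample(pop, size=sampleSize, replace=TRUE)
    xbar = sum(s) / sampleSize
    # var() uses the n-1 denominator, matching s^2 above
    out[trial] = xbar * (1-xbar) + var(s) / sampleSize
}
mu = sum(pop) / length(pop)   # population mean
print(paste("True value of D = ", mu * (1-mu)))
print(paste("Average estimated value of D = ", mean(out)))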
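
The simulation above only covers $\mu = 0.5$. As a rough sketch, assuming i.i.d. draws generated with `rbinom` rather than `sample`, the variant below repeats the experiment for a few other values of $\mu$ and also reports the plain plug-in estimate $\bar x(1-\bar x)$ for comparison. The names `simulate_D`, `plug_in` and `corrected`, the seed and the chosen values of $\mu$ are illustrative choices, not part of the original code.

```r
set.seed(1)            # arbitrary seed, only to make the run reproducible
nbtrials   <- 20000
sampleSize <- 10

# Hypothetical helper: average both estimators over many samples for a given mu
simulate_D <- function(mu) {
  # each row is one sample of sampleSize values in {0,1} with P(1) = mu
  draws <- matrix(rbinom(nbtrials * sampleSize, size = 1, prob = mu),
                  nrow = nbtrials)
  xbar <- rowMeans(draws)
  s2   <- apply(draws, 1, var)   # sample variance with the n-1 denominator
  c(mu        = mu,
    true      = mu * (1 - mu),
    plug_in   = mean(xbar * (1 - xbar)),
    corrected = mean(xbar * (1 - xbar) + s2 / sampleSize))
}

# one row per value of mu, one column per quantity
results <- t(sapply(c(0.1, 0.3, 0.5), simulate_D))
print(round(results, 4))
```

In such a run the `corrected` column should sit close to `true`, while `plug_in` should fall systematically below it, which is what the $s^2/n$ term compensates for.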

There are 2 best solutions below


You need several facts, which hold because the $X_i$ are i.i.d. with values in $\{0,1\}$: $$ E[X_1] = \mu, \quad Var[X_1] = \mu(1 - \mu), \quad E[\bar{X}] = E[X_1], \quad Var[\bar{X}] = \frac {1} {n} Var[X_1], \quad E[S^2] = Var[X_1]$$

Then we have $$ \begin{align} E[\hat{D}] &= E[\bar{X}] - E[\bar{X}^2]+\frac {1} {n} E[S^2] \\ &= \mu - Var[\bar{X}]-E[\bar{X}]^2 + \frac {1} {n}\mu(1 - \mu) \\ &= \mu - \frac {1} {n} \mu(1-\mu)-\mu^2+\frac {1} {n}\mu(1 - \mu) \\ & = \mu - \mu^2 \\ & = \mu(1 - \mu) \end{align}$$

Therefore $\hat{D}$ is an unbiased estimator of $\mu(1 - \mu)$.
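
The derivation can also be checked without simulation noise. A minimal sketch, assuming i.i.d. draws so that the number of ones in the sample is $\text{Binomial}(n,\mu)$: sum $\hat D$ over every possible count of ones, weighted by its probability. The function name `expected_D_hat` is made up for this illustration, and $n \ge 2$ is required so that the sample variance is defined.

```r
# Exact E[D_hat] for i.i.d. 0/1 draws with P(1) = mu and sample size n >= 2
expected_D_hat <- function(n, mu) {
  k <- 0:n   # possible numbers of ones in the sample
  # value of D_hat for a sample containing `ones` ones and n - ones zeros
  d_hat <- sapply(k, function(ones) {
    x <- c(rep(1, ones), rep(0, n - ones))
    mean(x) * (1 - mean(x)) + var(x) / n
  })
  # weight each value by the binomial probability of that count
  sum(dbinom(k, size = n, prob = mu) * d_hat)
}

# compare the exact expectation with mu * (1 - mu) for a few values of mu
for (mu in c(0.1, 0.37, 0.5)) {
  cat("mu =", mu,
      " E[D_hat] =", expected_D_hat(10, mu),
      " mu(1-mu) =", mu * (1 - mu), "\n")
}
```

Up to floating-point rounding, the two printed quantities should agree for any $\mu$ in $[0,1]$ and any $n \ge 2$.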


Note that $$E[\bar{x}]=\mu,$$ $$var(\bar{x})=\frac{\mu}{n}\left(1-\mu\right),$$ and $$E[\bar{x}^2]=var(\bar{x})+E[\bar{x}]^2 = \mu^2 + \frac{\mu}{n}\left(1-\mu\right).$$ Also, $$E[s^2]=var{(x_i)}=\mu\left(1-\mu\right).$$ Therefore, $$E[\hat{D}]=E[\bar{x}]-E[\bar{x}^2]+\frac{E[s^2]}{n}=\mu-\mu^2-\frac{\mu(1-\mu)}{n}+\frac{\mu\left(1-\mu\right)}{n} = \mu(1-\mu).$$

So $\hat{D}$ is an unbiased estimator of $\mu(1-\mu)$.

Incidentally, $s^2$ is also an unbiased estimator of $\mu(1-\mu)$.
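
This last remark can be made more precise with a short worked step, assuming $n \ge 2$ and data taking only the values 0 and 1. Since $x_i^2 = x_i$ for such data,

$$\sum_{i=1}^n (x_i-\bar x)^2 = \sum_{i=1}^n x_i^2 - n\bar x^2 = n\bar x - n\bar x^2 = n\bar x(1-\bar x),$$

so $s^2 = \frac{n}{n-1}\bar x(1-\bar x)$, and therefore

$$\hat D = \bar x(1-\bar x) + \frac{s^2}{n} = \bar x(1-\bar x) + \frac{\bar x(1-\bar x)}{n-1} = \frac{n}{n-1}\bar x(1-\bar x) = s^2.$$

In other words, for 0/1 data $\hat D$ and $s^2$ are the same statistic, which is consistent with both being unbiased for $\mu(1-\mu)$.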