Unbiased estimator for $p^2$. Bernoulli distribution.


Let $X_{1},\dots,X_{n}$ be a random sample from a $\mathrm{Bernoulli}(p)$ distribution. Find an unbiased estimator for $p^{2}$.

I think it's the same as the estimator for $\mathrm{Bin}(n,p)$, so:

$${V}(X) = np(1-p) = np - np^2 \\ p^2 =\frac{1}{n}(np - {V}(X)) =\frac{1}{n}({E}[X] - {V}(X)) \\ \hat{p}^2=\frac{1}{n} \left(\hat{p} - \frac{\hat{p}(1-\hat{p})}{n} \right) $$

$\hat{p}^2$ is unbiased, because a linear combination of unbiased estimators of two quantities is an unbiased estimator of the same linear combination of those quantities.

Is this correct?


There are 2 best solutions below


Your notation is confusing. $\hat p^2$ could either mean $(\hat p)^2$, or it could mean $\widehat{p^2}$. You also don't define $\hat p$ itself.

The question asks for an unbiased estimator of $p^2$. Naturally, an unbiased estimator of $p$ is $$\hat p = \bar X = \frac{1}{n}\sum_{i=1}^n X_i,$$ the sample mean of observations. We can confirm this by computing $$\operatorname{E}[\hat p] = \operatorname{E}\left[\frac{1}{n} \sum_{i=1}^n X_i\right] = \frac{1}{n} \sum_{i=1}^n \operatorname{E}[X_i] = \frac{1}{n} \sum_{i=1}^n p = \frac{1}{n} \cdot np = p.$$
What if we simply took as our estimator for $p^2$ $$(\hat p)^2 = (\bar X)^2 = \left(\frac{1}{n} \sum_{i=1}^n X_i\right)^2?$$ What is the expectation of this estimator? There are a few ways to compute it. The naive way is to expand the square: $$\operatorname{E}\left[\left(\frac{1}{n} \sum_{i=1}^n X_i \right)^2\right] = \frac{1}{n^2} \sum_{i=1}^n \sum_{j=1}^n \operatorname{E}[X_i X_j].$$

When $i \ne j$, $\operatorname{E}[X_i X_j] = \operatorname{E}[X_i]\operatorname{E}[X_j] = p^2$ because $X_i$ and $X_j$ are independent; but when $i = j$, we have $X_i X_j = X_i^2$, hence $$\operatorname{E}[X_i^2] = \operatorname{Var}[X_i] + \operatorname{E}[X_i]^2 = p(1-p) + p^2 = p.$$ Since the double sum has $n$ terms with $i = j$ and $n^2$ terms in total, it follows that $$\operatorname{E}[(\hat p)^2] = \frac{1}{n^2} \left( (n^2-n) p^2 + n p\right) = \frac{p(1-p)}{n} + p^2.$$

So the bias here is $p(1-p)/n$. To rid ourselves of it, collect like terms in $p$ to get $$\operatorname{E}[(\hat p)^2] = \frac{p}{n} + \frac{n-1}{n} p^2.$$ Since $\operatorname{E}[\hat p] = p$, linearity of expectation gives $$\operatorname{E}\left[(\hat p)^2 - \frac{\hat p}{n}\right] = \operatorname{E}[(\hat p)^2] - \frac{1}{n}\operatorname{E}[\hat p] = \left(\frac{p}{n} + \frac{n-1}{n} p^2\right) - \frac{p}{n} = \frac{n-1}{n} p^2.$$ Note that this is a genuine expectation calculation, not merely replacing $p$ with $\hat p$ and rearranging. Therefore, our unbiased estimator should be $$\widehat{p^2} = \frac{n}{n-1}\left((\hat p)^2 - \frac{\hat p}{n}\right).$$
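As a quick sanity check (my own sketch, not part of the answer), the estimator above can be verified by Monte Carlo simulation; the function name `p2_hat` and all parameter values are mine. Note that algebraically $\frac{n}{n-1}\left((\hat p)^2 - \frac{\hat p}{n}\right) = \frac{\bar X(n\bar X - 1)}{n-1}$.

```python
import random

def p2_hat(xs):
    """Unbiased estimator of p^2: (n/(n-1)) * (p_hat^2 - p_hat/n), needs n >= 2."""
    n = len(xs)
    p_hat = sum(xs) / n
    return n / (n - 1) * (p_hat ** 2 - p_hat / n)

random.seed(42)
p, n, trials = 0.3, 5, 200_000

# Average the estimator over many independent samples of size n.
total = 0.0
for _ in range(trials):
    sample = [1 if random.random() < p else 0 for _ in range(n)]
    total += p2_hat(sample)

avg = total / trials  # should be close to p^2 = 0.09
```

With 200,000 trials the Monte Carlo average should land well within a standard error or two of $p^2 = 0.09$, while the naive $(\bar X)^2$ would center on $p^2 + p(1-p)/n = 0.132$.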


Interesting, but my mistake was to treat the Bernoulli sample as a Binomial distribution, which is absolutely wrong. It is also worth noting that $p^2$ is not estimable from a single observation $X \sim \mathrm{Bernoulli}(p)$. Suppose there were a function $g$ of $X$ that satisfied the equation

$E_p[g(X)] = p^2$ for all $p \in (0,1)$.

Since $X$ takes only the values $0$ and $1$, this requires

$g(1)p + g(0)(1-p) = p^2$ for all $p \in (0,1)$.

If this equation holds for all p, then we would have a function $f$, namely

$f(p) = p^2 + (g(0) - g(1))p - g(0)$,

that is quadratic in the variable $p$ and is equal to zero for all $p$ in the interval (0,1). But there can be no such quadratic function, as quadratic functions (whose graphs are parabolic) can have at most two real roots, that is, can have at most two values of $p$ for which $f(p)=0$. It follows that a function $g$ satisfying the property $E_pg(X) = p^2$ does not exist.
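The argument can be illustrated numerically (my own sketch, not from the answer above): since $E_p[g(X)] = g(0)(1-p) + g(1)p$ is linear in $p$, we can force it to agree with the quadratic $p^2$ at two chosen values of $p$, but it must then miss at any third value.

```python
# Fit g(0), g(1) so that E_p[g(X)] = g(0)*(1 - p) + g(1)*p matches p^2
# at p = 0.2 and p = 0.8 (arbitrary choices); a line through two points
# of a parabola cannot follow it anywhere else.
p1, p2 = 0.2, 0.8
slope = (p2 ** 2 - p1 ** 2) / (p2 - p1)  # g(1) - g(0)
g0 = p1 ** 2 - slope * p1                # value g(0)
g1 = g0 + slope                          # value g(1)

def expected_value(p):
    """E_p[g(X)] for a single Bernoulli(p) observation."""
    return g0 * (1 - p) + g1 * p

# Exact agreement at the two fitted points, but a gap at p = 0.5:
gap = expected_value(0.5) - 0.5 ** 2  # nonzero, so g cannot be unbiased
```

Here the gap at $p = 0.5$ works out to $0.09$, confirming that no choice of $g(0), g(1)$ can satisfy the unbiasedness equation on all of $(0,1)$.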