Why can I estimate the variance if I already estimated the mean in this way?


In my lectures, the estimation of $Var(X)$ is discussed for the case where I have already estimated $E(X)$, but there is something I am not sure is a problem or not.

Suppose $Var(X) = f(E(X))$; for example, for a Bernoulli random variable we have $Var(X) = E(X)(1-E(X))$. If I have estimated $E(X)$ by $\hat{\theta}$, then I could use $f(\hat{\theta})$ as an estimate of $Var(X)$.
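As a minimal sketch of this plug-in idea in Python (the values of `p` and `n` are illustrative choices, not from the question):

```python
import random

random.seed(0)

# Hypothetical true mean E(X) = p of a Bernoulli(p) variable.
p = 0.3
n = 10_000

# Draw a sample of size n.
sample = [1 if random.random() < p else 0 for _ in range(n)]

# Estimate E(X) by the sample mean theta_hat, then plug it into
# f(t) = t * (1 - t) to get the plug-in estimate of Var(X).
theta_hat = sum(sample) / n
var_hat = theta_hat * (1 - theta_hat)

print(theta_hat)  # should be near 0.3
print(var_hat)    # should be near 0.3 * 0.7 = 0.21
```

For large `n` the plug-in estimate lands close to the true variance, which is exactly the behavior the question is asking about: the error in `theta_hat` carries over into `var_hat` through the function `f`.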

However, something in this reasoning is not clear to me regarding the error in the estimate: if the estimate of $E(X)$ is not exact (which is always the case), then the estimate of $Var(X)$ could be much more inaccurate, depending on the actual function $f$.

Is this method used because it's assumed that the estimate $\hat{\theta}$ equals the actual value of $E(X)$? I don't have much background in statistical inference, so I don't know whether this is a real problem.


Best answer:

What @Ian said is true, though not complete: for special distributions (like the Bernoulli), there may be a sufficient statistic which (loosely speaking) contains all the information in the sample about the parameter(s). For a more formal discussion, see here: https://en.wikipedia.org/wiki/Sufficient_statistic#Bernoulli_distribution

So while in general you'd estimate a variance by the sample variance, in this special case you'd better use a function of the sufficient statistic. Now $\hat{\theta}(1-\hat{\theta})$ would be biased, because an easy calculation shows $\displaystyle\mathbb{E}\,\hat{\theta}(1-\hat{\theta})=\frac{n-1}n\,p(1-p)$, but the remedy is obvious: use $\displaystyle\frac{n}{n-1}\,\hat{\theta}(1-\hat{\theta})$ as the estimator instead.
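The bias claim above is easy to check numerically. A sketch in Python (the values of `p`, `n`, and `reps` are illustrative assumptions), approximating $\mathbb{E}\,\hat{\theta}(1-\hat{\theta})$ by averaging the naive estimator over many independent samples:

```python
import random

random.seed(1)

p, n, reps = 0.3, 10, 20_000

# Average the naive plug-in estimator theta_hat * (1 - theta_hat)
# over many samples of size n to approximate its expectation.
naive_sum = 0.0
for _ in range(reps):
    sample = [1 if random.random() < p else 0 for _ in range(n)]
    theta_hat = sum(sample) / n
    naive_sum += theta_hat * (1 - theta_hat)

mean_naive = naive_sum / reps
target = p * (1 - p)              # true variance: 0.21
predicted = (n - 1) / n * target  # predicted bias: (n-1)/n * p(1-p) = 0.189

print(mean_naive)                  # near 0.189, not 0.21
print(n / (n - 1) * mean_naive)    # bias-corrected: near 0.21
```

The simulated mean of the naive estimator matches $(n-1)/n \cdot p(1-p)$, and multiplying by $n/(n-1)$ recovers the true variance on average.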

Another answer:

If you have an estimator $\hat\theta$ for a parameter $\theta$ and you are interested in estimating $h(\theta)$ for some function $h$, then it is legitimate to propose $h(\hat\theta)$ as an estimator for $h(\theta)$. You are correct that if there is uncertainty in $\hat\theta$, then that uncertainty will propagate to $h(\hat\theta)$. A simple way to assess the uncertainty is to expand $h(\hat\theta)$ in a one-term Taylor approximation around $\theta$:
$$ h(\hat\theta) \approx h(\theta) + h'(\theta)(\hat\theta-\theta).\tag1 $$
This gives a rough measure of the variance of $h(\hat\theta)$, namely:
$$ \operatorname{Var}(h(\hat\theta))\approx [h'(\theta)]^2 \operatorname{Var}(\hat\theta).\tag2 $$
You can interpret (2) as an approximate relation between the variance of $\hat\theta$ and the variance of $h(\hat\theta)$.

Taking expectations in (1) also gives a rough measure of the expectation of $h(\hat\theta)$. This first-order approximation to the mean and variance of $h(\hat\theta)$ is often called the delta method. An example of its use, applied to estimation of $p(1-p)$ in the Bernoulli case, is here.
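A quick numerical check of the delta-method approximation (2) for the Bernoulli case, where $h(t)=t(1-t)$, $h'(t)=1-2t$, and $\operatorname{Var}(\hat\theta)=p(1-p)/n$ (the values of `p`, `n`, and `reps` below are illustrative assumptions):

```python
import random

random.seed(2)

p, n, reps = 0.3, 200, 20_000

# Monte Carlo: sample theta_hat many times and record h(theta_hat)
# with h(t) = t * (1 - t).
vals = []
for _ in range(reps):
    sample = [1 if random.random() < p else 0 for _ in range(n)]
    t = sum(sample) / n
    vals.append(t * (1 - t))

# Empirical variance of h(theta_hat).
m = sum(vals) / reps
mc_var = sum((v - m) ** 2 for v in vals) / reps

# Delta-method prediction from (2): [h'(p)]^2 * Var(theta_hat),
# with h'(p) = 1 - 2p and Var(theta_hat) = p * (1 - p) / n.
delta_var = (1 - 2 * p) ** 2 * p * (1 - p) / n

print(mc_var, delta_var)  # the two values should be close
```

For moderate `n` the first-order approximation tracks the simulated variance well; the discrepancy shrinks as `n` grows, since the neglected Taylor terms are of higher order in $\hat\theta - \theta$.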