Accuracy of estimation of the variance


Consider $N$ i.i.d. samples $x_1, \dots, x_N$ from an unknown discrete distribution on $\{0,\dots, n\}$. We know that

$$\frac{1}{N-1} \sum_{i = 1}^N (x_i - m)^2$$

where $m$ is the sample mean, is an unbiased estimator for the variance.

But what confidence intervals can we give for this estimate?


There are 2 best solutions below


If the $X_i$ are i.i.d. normally distributed $N(\mu,\sigma^2)$ with $s^2=\frac{1}{N-1} \sum\limits_{i = 1}^N (x_i - \bar{x})^2$ then $$\frac{(N-1)s^2}{\sigma^2} \sim \chi^2_{N-1}$$

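so, for example, if you want a $95\%$ confidence interval for the variance then you could use something like $$\left[ \frac{(N-1)s^2}{\chi^2_{0.025,N-1}}, \frac{(N-1)s^2}{\chi^2_{0.975,N-1}} \right]$$ where $\chi^2_{p,N-1}$ denotes the upper-tail critical value, i.e. the value that a $\chi^2_{N-1}$ random variable exceeds with probability $p$ (so $\chi^2_{0.025,N-1} > \chi^2_{0.975,N-1}$ and the first endpoint is the smaller one).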
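As a rough illustration, the interval above can be computed as in the following sketch. It assumes SciPy is available and that the data really are normal; the function name `chi2_variance_ci` and the sample values are made up for the example. Note that `chi2.ppf` is the quantile function (inverse CDF), so the critical value with upper-tail probability $0.025$ corresponds to `ppf(0.975)`.

```python
from scipy.stats import chi2


def chi2_variance_ci(xs, level=0.95):
    """Normal-theory confidence interval for the variance (hypothetical helper)."""
    n = len(xs)
    mean = sum(xs) / n
    # Unbiased sample variance s^2 with divisor N - 1.
    s2 = sum((x - mean) ** 2 for x in xs) / (n - 1)
    alpha = 1.0 - level
    # Upper-tail critical value chi^2_{alpha/2} is the (1 - alpha/2) quantile.
    upper_crit = chi2.ppf(1 - alpha / 2, df=n - 1)
    # Upper-tail critical value chi^2_{1 - alpha/2} is the alpha/2 quantile.
    lower_crit = chi2.ppf(alpha / 2, df=n - 1)
    return (n - 1) * s2 / upper_crit, (n - 1) * s2 / lower_crit


# Made-up sample, for illustration only.
sample = [2, 4, 4, 4, 5, 5, 7, 9]
lo, hi = chi2_variance_ci(sample)
print(f"95% interval for the variance: [{lo:.3f}, {hi:.3f}]")
```

The interval is not symmetric around $s^2$, because the $\chi^2$ distribution is skewed.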

You may need other approaches if the $X_i$ are not normally distributed. If the underlying distribution were Binomial with known $n$ and unknown $p$, the normal approximation may be good enough for large $N$, or you could take a more specific approach.


You may use the asymptotic approximation. Specifically, there is enough structure to invoke the CLT (a distribution on $\{0,\dots,n\}$ is bounded, so its fourth moment is finite), i.e., letting $S_n^2$ denote the sample variance ($n$ here denotes the sample size, not the upper end of the support), \begin{align} \sqrt{n}(S_n^2-\sigma^2)&=\frac{1}{\sqrt{n}}\sum_{i=1}^n\left[(X_i-\mathsf{E}X_i)^2-\sigma^2\right]+o_p(1)\\ &\xrightarrow{d}N(0,\sigma^4(\kappa-1)), \end{align} where $\kappa:=\mu_4/\sigma^4$, $\mu_k$ is the $k$-th central moment of $X_1$, and $\sigma^2=\operatorname{Var}(X_1)$. The limiting variance is $\mu_4-\sigma^4=\sigma^4(\kappa-1)$. Therefore, the asymptotic confidence interval for $\sigma^2$ (at nominal confidence level $1-\alpha$) is of the form $$ S_n^2\pm z_{1-\frac{\alpha}{2}}S_n^2\sqrt{(K_n-1)/n}, $$ where $K_n$ is the sample kurtosis (an estimate of $\kappa$) and $z_{1-\frac{\alpha}{2}}$ is the $(1-\alpha/2)$ quantile of the standard normal distribution.
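A minimal sketch of this asymptotic interval, using only the Python standard library, might look as follows. The function name `asymptotic_variance_ci` and the sample are invented for illustration, and taking $K_n$ as the plug-in ratio $m_4/m_2^2$ of moments with divisor $n$ is one common choice, assumed here rather than stated in the answer.

```python
from math import sqrt
from statistics import NormalDist


def asymptotic_variance_ci(xs, alpha=0.05):
    """CLT-based interval S^2 +/- z * S^2 * sqrt((K - 1) / n) (hypothetical helper)."""
    n = len(xs)
    mean = sum(xs) / n
    # Unbiased sample variance S_n^2 with divisor n - 1.
    s2 = sum((x - mean) ** 2 for x in xs) / (n - 1)
    # Plug-in central moments with divisor n (an assumption of this sketch).
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    kurt = m4 / m2 ** 2  # sample kurtosis K_n; always >= 1
    # Standard normal quantile z_{1 - alpha/2}.
    z = NormalDist().inv_cdf(1 - alpha / 2)
    half_width = z * s2 * sqrt((kurt - 1) / n)
    return s2 - half_width, s2 + half_width


# Made-up sample, for illustration only.
sample = [2, 4, 4, 4, 5, 5, 7, 9]
lo, hi = asymptotic_variance_ci(sample)
print(f"approximate 95% interval for the variance: [{lo:.3f}, {hi:.3f}]")
```

For small samples the lower endpoint can come out negative, which is a reminder that the approximation is only justified for large $n$.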