Question about standard error of sample standard deviation


I have a question regarding formula (3.8) on page 84 of the sixth edition of the Financial Risk Manager Handbook. For convenience, I will rephrase the problem below.

Define $X$ as the random variable of interest. We observe a sequence of $T$ realized values of $X$: $x_{1},x_{2},\ldots, x_{T}$. The sample mean is \begin{equation*} m=\hat\mu = \frac{1}{T}\sum_{i=1}^{T}x_{i}, \end{equation*} and the sample variance is \begin{equation*} s^{2}=\hat\sigma^{2} = \frac{1}{T-1}\sum_{i=1}^{T}(x_{i}-\hat\mu)^{2}. \end{equation*}

Suppose $x_{1},x_{2},\ldots, x_{T}$ are drawn from a normal distribution with mean $\mu$ and variance $\sigma^{2}$. It has been proved that \begin{equation*} (T-1)\hat\sigma^{2}/\sigma^{2} \sim \chi^{2}(T-1). \end{equation*} It has also been proved that if the sample size $T$ is large enough, the chi-square distribution converges to a normal distribution, so that \begin{equation} \hat\sigma^{2} \sim N\left(\sigma^{2},\ \sigma^{4}\frac{2}{T-1}\right). \end{equation} The central limit theorem states that this distribution is only valid asymptotically.

How does one then obtain the result that, approximately, the sample standard deviation $\hat\sigma$ has a normal distribution with a standard error of \begin{equation} se(\hat\sigma)=\sigma \sqrt {\frac{1}{2T}}\,? \end{equation} By the definition of standard error, $se(\hat\sigma)=\sqrt{E[\hat\sigma-E(\hat\sigma)]^{2}}=\sqrt{E[\hat\sigma^{2}]-(E[\hat\sigma])^{2}}$. I am wondering how to get the standard error $\sigma\sqrt{\frac{1}{2T}}$ from this? Thanks!


There are 2 answers below.

BEST ANSWER

I think you may be confusing two kinds of estimation:

$1.$ For estimating $\mu,$ the estimator is $\bar X = \frac{1}{n}\sum_i X_i.$ Then the standard error (SE) of $\bar X$ is $SD(\bar X) = \sigma/\sqrt{n}.$ When $\sigma$ is unknown and estimated by $S$, the (estimated) SE is $S/\sqrt{n}.$ Because the case with $\sigma$ unknown is the most common in practice, the word estimated is often dropped out of laziness. (I have used $n$ instead of your $T$ in order to avoid confusion with notation in the next paragraph.)
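As a quick sanity check, a simulation in R (the values of $m$, $n$, and $\sigma$ here are arbitrary illustration choices) shows that the SD of many simulated sample means is close to $\sigma/\sqrt{n}$:

```r
set.seed(1)                      # for reproducibility
m <- 10^5; n <- 50; sigma <- 2   # arbitrary illustration values
# each entry of xbar is the mean of one sample of size n
xbar <- replicate(m, mean(rnorm(n, mean = 0, sd = sigma)))
sd(xbar)          # simulated SD of the sample mean
sigma / sqrt(n)   # theoretical SE of the mean: approx 0.283
```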

To make a 95% CI for $\mu$, the usual procedure is to observe that the statistic $T = \frac{\bar X - \mu}{S/\sqrt{n}} \sim \mathsf{T}(df=n-1),$ so that one can find constants $L$ and $U$ with $P(L < T < U) = 0.95.$ With simple manipulation of inequalities, one finds a CI of the form $\bar X \pm t^*S/\sqrt{n},$ where it is customary to use $t^* = -L = U$ by symmetry, where $t^*$ cuts 2.5% from each tail of $\mathsf{T}(df=n-1).$ Roughly speaking, one says that the CI is the point estimate $\bar X$ of $\mu$ plus or minus a 'probability factor' times the standard error.
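For concreteness, here is a minimal R sketch of that CI (the data vector is a hypothetical sample, not from the question):

```r
x <- c(4.1, 5.3, 3.8, 4.9, 5.0, 4.4, 4.7, 5.2)   # hypothetical sample
n <- length(x)
tstar <- qt(0.975, df = n - 1)                   # cuts 2.5% from each tail
ci <- mean(x) + c(-1, 1) * tstar * sd(x) / sqrt(n)
ci                    # 95% CI for mu: xbar +/- t* S/sqrt(n)
t.test(x)$conf.int    # R's built-in computation gives the same interval
```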

$2.$ For estimating $\sigma^2$, as you say, one uses the fact that $Q = \frac{(n-1)S^2}{\sigma^2} \sim \mathsf{Chisq}(df = n-1).$ Thus to make a 95% CI for $\sigma^2,$ one can find $L$ and $U$ (different from above) so that $P(L < Q < U) = 0.95,$ and the CI is of the form $\left(\frac{(n-1)S^2}{U}, \frac{(n-1)S^2}{L}\right).$ Because the chi-squared distribution is not symmetrical, $L>0$ and $U>L$ have different numerical values. Often one chooses $L$ to cut 2.5% of the probability from the lower tail of $\mathsf{Chisq}(df = n-1)$ and $U$ to cut 2.5% from its upper tail. (This is sometimes called a 'probability-symmetric' CI; it is generally not the shortest possible CI, but it is a convenient choice.) To find a CI for $\sigma,$ take square roots of the endpoints of the CI for $\sigma^2.$
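That chi-squared interval can be sketched in R as follows (again with a hypothetical sample):

```r
x <- c(4.1, 5.3, 3.8, 4.9, 5.0, 4.4, 4.7, 5.2)   # hypothetical sample
n <- length(x); s2 <- var(x)
L <- qchisq(0.025, df = n - 1)   # cuts 2.5% from the lower tail
U <- qchisq(0.975, df = n - 1)   # cuts 2.5% from the upper tail
ci_var <- c((n - 1) * s2 / U, (n - 1) * s2 / L)  # 95% CI for sigma^2
sqrt(ci_var)                                     # 95% CI for sigma
```

Note that the point estimate $S^2$ is not at the center of this interval, reflecting the skewness of the chi-squared distribution.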

In practical applications, one does not often speak of the SE of the estimator $S^2$ of $\sigma^2$ or of $S$ of $\sigma$, because the SE does not play an explicit role in finding confidence limits. Because $S^2$ is an unbiased estimator based on a complete sufficient statistic, $S^2$ has the smallest variance among unbiased estimators of $\sigma^2$; that implies that CIs based on $S^2$ tend to be the 'best' ones available, and applied discussions often end there.

It is true that $S^2 = \widehat{\sigma^2}$ is asymptotically normal, so that for very large $n$, CIs for $\sigma^2$ become nearly symmetrical, but the form of CI I showed above is the one in standard use.

Addendum (per Comments): The terminology 'standard error' is typically used to mean the standard deviation of an estimator. I am not sure what you are estimating. For a sample of size $n$ from $\mathsf{Norm}(\mu, \sigma),$ one can manipulate the gamma distribution of $S_n^2$ to show that $E(S_n) = \sigma\sqrt{\frac{2}{n-1}}\Gamma\left(\frac{n}{2}\right)/\Gamma\left(\frac{n-1}{2}\right).$ Expressing the $\Gamma$-function in terms of factorials by Stirling's approximation, one can show that $E(S_n) \rightarrow \sigma.$ I suppose you could use $E(S_n)$ to find $Var(S_n)$ (since you know $E(S_n^2)$) and hence $SD(S_n),$ if that is what you want. Frankly, I'm not sure whether you mean $SE(\hat \sigma)$ to be $SD(S_n)$ or whether either is approximately equal to $\sigma\sqrt{.5/n}.$
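To illustrate (an R sketch; $n = 50$ and $\sigma = 1$ are chosen to match the simulation below), the exact formula for $E(S_n)$ can be evaluated directly, and then $Var(S_n) = E(S_n^2) - E(S_n)^2$ with $E(S_n^2) = \sigma^2$ gives the exact $SD(S_n)$:

```r
n <- 50; sigma <- 1
# c4 = E(S_n)/sigma = sqrt(2/(n-1)) * Gamma(n/2) / Gamma((n-1)/2)
c4 <- sqrt(2 / (n - 1)) * gamma(n / 2) / gamma((n - 1) / 2)
ES <- sigma * c4               # exact E(S_n), slightly below sigma
VS <- sigma^2 - ES^2           # Var(S_n), since E(S_n^2) = sigma^2
sqrt(VS)                       # exact SD(S_n), approx 0.1005
sigma * sqrt(1 / (2 * n))      # the approximation sigma*sqrt(1/(2n)) = 0.1
```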

A simulation in R statistical software of a million sample standard deviations $S_n$ each with $n = 50$ and $\sigma = 1$ shows $SD(S_n) \approx \sigma\sqrt{\frac{1}{2n}} = 0.1,$ which might be promising.

m <- 10^6; n <- 50                # a million samples of size n = 50
s <- replicate(m, sd(rnorm(n)))   # each entry is one S_n with sigma = 1
sd(s)
## 0.100835  # approx 0.1 as anticipated

However, a histogram of simulated values $S_n^2$ is not consistent with the density function of $\mathsf{Norm}(1, \sqrt{2/49})$ (second argument is SD). So $n = 50$ is not nearly a large enough sample size for asymptotic normality of $S^2.$

[Histogram of the simulated $S_n^2$ values, which visibly does not match the $\mathsf{Norm}(1, \sqrt{2/49})$ density.]

Perhaps there are enough clues between these comments and those of @Yves (+1) for you or someone else to resolve this to your satisfaction.

Also, some of this is discussed in this link.

ANOTHER ANSWER

For $x_i$ normally distributed, there is no need to invoke the central limit theorem.

The statistical distribution of the variance estimator $\hat\sigma^2$ is indeed, up to the factor $\sigma^2/(T-1)$, a $\chi^2$ law with $T-1$ degrees of freedom, i.e. a sum of squared standard normal variables. The expectation of this estimator is then computed exactly from the $\chi^2$ distribution as the unbiased value

$$\mu_{\hat\sigma^2}=\sigma^2$$

and the variance,

$$\sigma^2_{\hat\sigma^2}=\frac{2\sigma^4}{T-1}.$$
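To connect this with the standard error quoted in the question, one can apply the delta method (a sketch of the asymptotic argument, not an exact finite-sample identity): with $g(u)=\sqrt{u}$ and $g'(\sigma^2)=\frac{1}{2\sigma}$,

$$\sigma^2_{\hat\sigma} \approx \left(\frac{1}{2\sigma}\right)^{2}\sigma^2_{\hat\sigma^2} = \frac{1}{4\sigma^{2}}\cdot\frac{2\sigma^{4}}{T-1} = \frac{\sigma^{2}}{2(T-1)},$$

so that $se(\hat\sigma) \approx \sigma\sqrt{\frac{1}{2(T-1)}} \approx \sigma\sqrt{\frac{1}{2T}}$ for large $T$, which is the handbook's formula.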