Recently I was going through Tom Mitchell's *Machine Learning* book and encountered a derivation of the standard deviation of the mean for a confidence interval problem.
The mean is defined as follows: $$\overline\delta = {1\over k} \sum_{i=1}^k \delta_i$$
And the standard deviation is as follows: $$\mathbf{ \sigma_{\overline\delta} } = \sqrt{{1 \over k(k-1)} \sum_{i=1}^k (\delta_i - \overline\delta)^2} $$
I cannot seem to understand why the $ \require{color}\mathbf{\colorbox{yellow}{(k-1)}}$ term is present in the standard deviation formula. I know this is trivial, but it's bothering me a lot.
Any help with this derivation will be much appreciated. Thank you in advance.
An unbiased estimate of the variance $\sigma^2$ of the distribution generating the $\delta_i$'s is
\begin{align*} S^2 \doteq \frac{1}{k-1}\sum_{i=1}^k (\delta_i - \bar{\delta})^2. \end{align*}
The divisor $k-1$ rather than $k$ is Bessel's correction: it compensates for the fact that the deviations are measured from the sample mean $\bar{\delta}$, which is itself computed from the same data. Assuming the samples are iid, the variance of the mean is
\begin{align*} \text{Var}(\bar{\delta}) = \frac{1}{k^2} \text{Var}\left( \sum_{i=1}^k \delta_i\right) = \frac{1}{k^2} \sum_{i=1}^k \text{Var}(\delta_i) = \frac{k\sigma^2}{k^2} = \frac{\sigma^2}{k}. \end{align*}
Hence an unbiased estimate for the variance of $\bar{\delta}$ is $\frac{S^2}{k}$, and the corresponding estimate of its standard deviation is $\sqrt{\frac{S^2}{k}}$. Plugging in the definition of $S^2$ gives
\begin{align*} \hat{\sigma}_{\bar{\delta}} = \sqrt{\frac{S^2}{k}} = \sqrt{\frac{1}{k(k-1)} \sum_{i=1}^k (\delta_i - \bar{\delta})^2}. \end{align*}
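To see why dividing by $k-1$ gives an unbiased estimate, you can compute the expectation of the sum of squared deviations directly (a standard computation, assuming the $\delta_i$ are iid with mean $\mu$ and variance $\sigma^2$). Writing each deviation from $\bar{\delta}$ in terms of deviations from $\mu$, and using $\sum_{i=1}^k (\delta_i - \mu) = k(\bar{\delta} - \mu)$:
\begin{align*}
\sum_{i=1}^k (\delta_i - \bar{\delta})^2
&= \sum_{i=1}^k \left[ (\delta_i - \mu) - (\bar{\delta} - \mu) \right]^2 \\
&= \sum_{i=1}^k (\delta_i - \mu)^2 - 2(\bar{\delta} - \mu)\sum_{i=1}^k (\delta_i - \mu) + k(\bar{\delta} - \mu)^2 \\
&= \sum_{i=1}^k (\delta_i - \mu)^2 - k(\bar{\delta} - \mu)^2.
\end{align*}
Taking expectations, and using $\mathbb{E}[(\delta_i - \mu)^2] = \sigma^2$ together with $\mathbb{E}[(\bar{\delta} - \mu)^2] = \text{Var}(\bar{\delta}) = \sigma^2/k$:
\begin{align*}
\mathbb{E}\left[ \sum_{i=1}^k (\delta_i - \bar{\delta})^2 \right] = k\sigma^2 - k \cdot \frac{\sigma^2}{k} = (k-1)\sigma^2,
\end{align*}
so dividing by $k-1$ (not $k$) makes $\mathbb{E}[S^2] = \sigma^2$.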
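If you want a quick numerical sanity check, here is a minimal NumPy sketch (the sample size `k`, the number of trials, and the true variance are arbitrary choices for illustration). Averaging the variance estimate over many trials, the $k$-divisor version underestimates $\sigma^2$ by the factor $(k-1)/k$, while the $(k-1)$-divisor version is centered on $\sigma^2$:

```python
import numpy as np

rng = np.random.default_rng(0)
k = 5            # samples per trial (small on purpose: the bias is O(1/k))
trials = 200_000
sigma2 = 4.0     # true variance of the generating distribution

# Draw many iid samples of size k from N(0, sigma2).
x = rng.normal(0.0, np.sqrt(sigma2), size=(trials, k))

# Variance estimates: ddof=0 divides by k, ddof=1 divides by k-1.
biased = x.var(axis=1, ddof=0).mean()
unbiased = x.var(axis=1, ddof=1).mean()

print(biased)    # close to sigma2 * (k-1)/k = 3.2
print(unbiased)  # close to sigma2 = 4.0
```

The same correction is what `ddof=1` does in NumPy and what statistics packages call the "sample variance".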