I understand that given a sample of data, the following formula can be used to estimate the population variance.
$\displaystyle S^{2} = \frac{1}{n-1}\sum_{i=1}^{n}(X_{i}-\overline{X})^{2}$
However, I was asked this question: when a sample is taken from the population, does the sample variance follow that of the population? Is it still more accurate to use $\frac{1}{n-1}$, or should we use $\frac{1}{n}$ instead?
Edit 1: Sorry if the question wasn't clear enough. What I meant is: if we compute the sample variance of a random sample from a population whose variance is known, do we take it as is, or do we have to adjust it by a factor of $\frac{n}{n-1}$?
Thank you.
If $X_1, X_2, \dots, X_n$ is a random sample from a population with variance $\sigma^2,$ then the sample variance $S^2 = \frac{1}{n-1}\sum_{i=1}^n (X_i- \bar X)^2$ has $E(S^2) = \sigma^2,$ so that $S^2$ is an unbiased estimate of $\sigma^2.$ If $n$ is used in the denominator instead of $n - 1$, then the estimator has expectation $\frac{n-1}{n}\sigma^2,$ so it is biased, but the bias decreases with increasing $n.$
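You can see this numerically with a short Monte Carlo check (a sketch using only the Python standard library; the sample size, variance, and seed are arbitrary choices, not from the question):

```python
# Draw many small samples from a population with known variance and compare
# the averages of the (n-1)-denominator and n-denominator variance estimates.
import random

random.seed(0)
n = 5             # small sample size, where the bias is most visible
sigma2 = 4.0      # true population variance: Normal(mu=0, sd=2)
reps = 200_000

sum_unbiased = 0.0
sum_biased = 0.0
for _ in range(reps):
    x = [random.gauss(0, 2) for _ in range(n)]
    xbar = sum(x) / n
    ss = sum((xi - xbar) ** 2 for xi in x)
    sum_unbiased += ss / (n - 1)   # S^2 with n-1 in the denominator
    sum_biased += ss / n           # dividing by n instead

print(sum_unbiased / reps)   # close to sigma2 = 4.0
print(sum_biased / reps)     # close to (n-1)/n * sigma2 = 3.2
```

The $n$-denominator average settles near $\frac{n-1}{n}\sigma^2$ rather than $\sigma^2$, which is exactly the bias described above.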
Furthermore, if the population is $\mathsf{Norm}(\mu,\sigma),$ then $\frac{(n-1)S^2}{\sigma^2} \sim \mathsf{Chisq}(df = n-1).$
Thus, if $L$ and $U$ are chosen (from software or printed tables of the chi-squared distribution) so that $P\left(L \le \frac{(n-1)S^2}{\sigma^2} \le U\right) = 0.95,$ then (after manipulation of the inequality) a 95% confidence interval for $\sigma^2$ is of the form $\left(\frac{(n-1)S^2}{U}, \frac{(n-1)S^2}{L}\right).$
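A sketch of that interval computation, using only the standard library: in practice one would take $L$ and $U$ from a library quantile function (e.g. `scipy.stats.chi2.ppf`), but here they are estimated by simulating the chi-squared distribution as a sum of squared standard normals. The values of $n$ and $S^2$ are hypothetical, for illustration only:

```python
# 95% confidence interval for sigma^2 based on (n-1)S^2/sigma^2 ~ Chisq(n-1).
import random

random.seed(1)
n = 20
s2 = 5.3          # hypothetical sample variance computed from the data
df = n - 1

# Simulate Chisq(df) draws: a sum of df squared standard normals.
draws = sorted(sum(random.gauss(0, 1) ** 2 for _ in range(df))
               for _ in range(100_000))
L = draws[int(0.025 * len(draws))]   # approximate 2.5% quantile
U = draws[int(0.975 * len(draws))]   # approximate 97.5% quantile

lo = df * s2 / U                     # lower CI endpoint for sigma^2
hi = df * s2 / L                     # upper CI endpoint for sigma^2
print((lo, hi))
```

Note the inversion: the upper chi-squared quantile $U$ produces the lower endpoint of the interval, and vice versa, because $\sigma^2$ appears in the denominator of the pivot.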
[Note: Technically, there are various ways to define the 'accuracy' of such an estimate, and unbiasedness is not the only criterion. It has been argued that, according to some criteria and for some population distributions, it might be better to use $n$ or even $n+1$ in the denominator. However, I'm guessing these are more advanced considerations than you have in mind.]