Relationship between population variance and sample variance as an estimate of population variance

I am trying to understand the link between the population variance, $\sigma^{2}$, and the sample variance, $s^{2}$, using a particular example.

Say I have a population of five elements $\{0,1,2,3,4\}$, so $N = 5$. I was initially asked to calculate the population variance, and $V(\bar{y})$ for the sample mean, using samples of size $n = 2$. I did all of this and give the necessary results below. The second part of the question asked me to show numerically that

$$E(s^{2}) = \frac{N}{N-1} \sigma^{2} $$

So now my curiosity kicked in. I know the result above means that $s^{2}$ is a biased estimator of the population variance, so to make it unbiased I would have to multiply $s^{2}$ by $\frac{N-1}{N}$. I tried this with a single value of $s^{2}$, computed from one of the samples of size $n = 2$, hoping to recover the population variance I calculated earlier, but the result was nowhere near it. I used the following values:

$\sigma^{2} = 2$

The sample I used was $\{0,1\}$,

which gave me the following estimates:

$\bar{y} = 0.5$

and a sample variance of $s^{2} = \frac{(0-0.5)^{2} + (1-0.5)^{2}}{n-1} = \frac{0.5}{1} = 0.5$

Thus: $\frac{N-1}{N}s^{2} = \frac{4}{5}(0.5) = 0.4$

I thought this would equal the "theoretical" variance regardless of which value of $s^{2}$ was obtained. Am I interpreting that wrong? Or is it that, since $s^{2}$ is only an estimate of the population variance, a single value of the estimator may well differ from the population parameter? Other samples I tried did fall closer to the population variance, but I was under the impression that making the estimator unbiased would make it always equal to the population parameter.
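The numerical check asked for above can be sketched by listing every sample of size 2. This is a minimal illustration, not part of the original exercise: it assumes simple random sampling without replacement (so all 10 pairs are equally likely), and the variable names are made up for the example.

```python
from itertools import combinations
from statistics import mean, pvariance, variance

population = [0, 1, 2, 3, 4]
N = len(population)
sigma2 = pvariance(population)  # population variance, divisor N

# Every sample of size n = 2, drawn without replacement (assumed design).
samples = list(combinations(population, 2))

# statistics.variance uses the divisor n - 1, like s^2 in the question.
s2_values = [variance(sample) for sample in samples]

# Rescale each s^2 by (N - 1) / N, as proposed in the question.
corrected = [(N - 1) / N * s2 for s2 in s2_values]

for sample, s2, est in zip(samples, s2_values, corrected):
    print(sample, "s^2 =", s2, "corrected =", round(est, 3))

# The individual estimates scatter around sigma^2; only their average matches it.
print("sigma^2        =", sigma2)
print("mean s^2       =", mean(s2_values))
print("mean corrected =", round(mean(corrected), 3))
```

The individual corrected values differ from sample to sample, and none of them has to equal $\sigma^{2}$. Unbiasedness is a statement about their average over all samples.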

There are 2 best solutions below

BEST ANSWER

For an unbiased estimator, it is the expected value of the estimator that equals the "theoretical" variance. Any particular numerical result may differ from it. In fact, in many applications the "theoretical" variance is not known at all.
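As a worked sketch for the population in the question, assume simple random sampling without replacement, so that each of the 10 possible pairs is equally likely. For a pair $\{a,b\}$ the sample variance is $s^{2} = \frac{(a-b)^{2}}{2}$, so it depends only on the difference $|a-b|$: there are 4 pairs with difference 1, 3 with difference 2, 2 with difference 3, and 1 with difference 4. Hence

$$E(s^{2}) = \frac{4(0.5) + 3(2) + 2(4.5) + 1(8)}{10} = \frac{25}{10} = 2.5 = \frac{5}{4}\sigma^{2}$$

and therefore

$$E\left(\frac{N-1}{N}s^{2}\right) = \frac{4}{5}(2.5) = 2 = \sigma^{2}$$

The rescaled estimator itself only takes the values $0.4$, $1.6$, $3.6$ and $6.4$, so no single sample returns exactly $\sigma^{2} = 2$, even though the average over all samples does.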


A summary of my recently published paper is given below for reference.

The paper explores empirically whether the sample variance is unbiased. Three finite populations of size 7, termed P1, P2 and P3, were considered. For each population, all possible distinct samples of size 2, 3, 4, 5 and 6 were generated, giving 21, 35, 35, 21 and 7 samples respectively, or 119 samples per population. For each population and each sample size, three statistics were calculated: the mean, V(n-1) and V(n), where V(n-1) and V(n) are the sum of squared deviations divided by (n-1) and n, respectively.

Contrary to the popular claim, V(n-1) was not found to be an unbiased estimate of the population variance, denoted V(N); it consistently overestimated V(N) by approximately 17%. V(n) was found to underestimate the population variance by 19%. Based on these results, it is suggested that the preference for V(n-1) over V(n) should be assessed critically and may be dropped. Consequently, and in view of my previous study, the use of the t-test for small samples should also be viewed critically, and the use of the Z-test in its place may be encouraged.

Ramnath Takiar (2022): SAMPLE VARIANCE - IS IT REALLY AN UNBIASED ESTIMATE OF THE POPULATION VARIANCE? - Bulletin of Mathematics and Statistics, Vol. 10(1), 21-30
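The enumeration described in this summary can be sketched as follows. The populations P1, P2 and P3 are not given here, so the population below is purely hypothetical, and the samples are assumed to be drawn without replacement, as "all possible distinct samples" suggests.

```python
from itertools import combinations
from statistics import mean, pvariance, variance

# Hypothetical population of size 7; the paper's P1-P3 are not reproduced here.
population = [3, 8, 12, 15, 21, 26, 34]
N = len(population)
v_N = pvariance(population)  # V(N): squared deviations divided by N

results = {}
for n in range(2, N):
    # All distinct samples of size n, without replacement.
    samples = list(combinations(population, n))
    mean_v_n1 = mean(variance(s) for s in samples)  # V(n-1), divisor n - 1
    mean_v_n = mean(pvariance(s) for s in samples)  # V(n), divisor n
    results[n] = (len(samples), mean_v_n1 / v_N, mean_v_n / v_N)

for n, (count, ratio_n1, ratio_n) in results.items():
    print(f"n={n}: {count} samples, "
          f"mean V(n-1)/V(N) = {ratio_n1:.4f}, mean V(n)/V(N) = {ratio_n:.4f}")
```

Under these assumptions the ratio for V(n-1) comes out as $\frac{N}{N-1} = \frac{7}{6} \approx 1.167$ for every sample size, which agrees with the roughly 17% figure quoted above and with the formula $E(s^{2}) = \frac{N}{N-1}\sigma^{2}$ from the question, where $\sigma^{2}$ uses the divisor $N$. The ratio for V(n) is $\frac{n-1}{n}\cdot\frac{7}{6}$, which comes to about 0.81 when pooled over all 119 samples.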