Standard Deviation: Population versus Sample (specific example)

So, I'm trying to use a t-test to test a hypothesis about the following data: my students were given a question in which they chose $1, 2, 3, 4$, or $5$ to indicate how much they enjoyed my student teacher. I have two different classes with two different sets of data, and I want to compare the classes. Unfortunately, not all of my students chose to turn in the ranking. Out of my class of $33$, only $25$ turned it in, and out of my class of $35$, only $14$ turned it in. When computing the values for my t-statistic and the degrees of freedom, do I use a population standard deviation or a sample standard deviation? I was always a bit confused, but would really like an answer specific to my situation.

There are 1 best solutions below

Sample vs. population standard deviation. I'm not exactly sure what you mean by sample vs. population SD. It is unusual to know the actual population standard deviation $\sigma,$ but if you do know it, then by all means use it. Why would you use the sample SD to estimate something you already know? (If you knew the population SD, you would use a normally distributed test statistic, not a t test.)

However, if you are asking whether you should divide the sum of the squared deviations by $n$ or by $n - 1$, then you should use the latter. Standard formulas, such as the formula for a t statistic, are derived using $n-1.$
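To make the $n$ versus $n - 1$ distinction concrete, here is a minimal Python sketch. The ratings are made up purely for illustration (they are not the data from the question), and the variable names are my own. It computes both versions by hand and checks them against the two functions in Python's standard `statistics` module.

```python
import statistics

# Hypothetical ratings on a 1-5 scale (made up for illustration only).
ratings = [4, 5, 3, 4, 2, 5, 4, 3]

n = len(ratings)
mean = sum(ratings) / n
ss = sum((x - mean) ** 2 for x in ratings)  # sum of squared deviations

sd_n = (ss / n) ** 0.5                # divides by n
sd_n_minus_1 = (ss / (n - 1)) ** 0.5  # divides by n - 1 (the one used in t tests)

# The standard library offers both conventions:
# pstdev divides by n, stdev divides by n - 1.
assert abs(sd_n - statistics.pstdev(ratings)) < 1e-12
assert abs(sd_n_minus_1 - statistics.stdev(ratings)) < 1e-12

print(sd_n, sd_n_minus_1)
```

The $n - 1$ version is always a little larger, and the gap shrinks as $n$ grows.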

Pooled t test. For the pooled 2-sample t test one assumes (often without basis) that the two populations have the same SD (or variance). In that case, one computes a 'pooled' estimate of the common variance:

$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2},$$

where $n_i$ is the size of sample $i = 1,2,$ and $s_i^2$ are their sample variances (denominators $n_i -1$). Take the square root to get the pooled SD.

In this case the degrees of freedom (DF) for the t statistic are $n_1 + n_2 - 2.$ (The t statistic has the t distribution if the null hypothesis of equal means is true.) The test statistic is $$T = \frac{\bar X_1 - \bar X_2}{s_p\sqrt{1/n_1 + 1/n_2}}.$$

[Here the distribution theory is exact. The derivation of the distribution theory works fine when there is only one variance estimator involved, and we created that situation (perhaps fancifully) by assuming a common variance to estimate.]
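As a rough sketch of the pooled computation, the following Python uses only the standard library. The function name `pooled_t` and the two small samples are hypothetical, chosen for illustration. It returns the statistic $T$, the degrees of freedom $n_1 + n_2 - 2$, and the pooled SD. Getting a P-value would additionally require a t distribution function, which the standard library does not provide.

```python
import math
import statistics

def pooled_t(x1, x2):
    """Pooled two-sample t statistic, its DF, and the pooled SD."""
    n1, n2 = len(x1), len(x2)
    # statistics.variance uses the n - 1 denominator.
    v1, v2 = statistics.variance(x1), statistics.variance(x2)
    sp2 = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)  # pooled variance
    diff = statistics.mean(x1) - statistics.mean(x2)
    t = diff / math.sqrt(sp2 * (1 / n1 + 1 / n2))
    df = n1 + n2 - 2
    return t, df, math.sqrt(sp2)

# Hypothetical ratings for two classes (made up, not the asker's data).
class_a = [4, 5, 3, 4, 5, 4]
class_b = [3, 2, 4, 3]

t, df, sp = pooled_t(class_a, class_b)
print(t, df, sp)
```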

Welch t test. The Welch or 'separate-variances' t test uses the test statistic

$$T^\prime = \frac{\bar X_1 - \bar X_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}.$$

When the null hypothesis is true, $T^\prime$ has approximately a t distribution with a rather complicated formula for DF. Again here, the $s_i^2$ are computed with $n_i - 1$ in denominators.
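For reference, the approximate DF usually quoted for the Welch test (the Welch-Satterthwaite formula) is

$$\nu = \frac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2}{\frac{(s_1^2/n_1)^2}{n_1 - 1} + \frac{(s_2^2/n_2)^2}{n_2 - 1}}.$$

A minimal Python sketch of $T^\prime$ and $\nu$ follows. The function name `welch_t` and the sample data are hypothetical, made up for illustration.

```python
import math
import statistics

def welch_t(x1, x2):
    """Welch (separate-variances) t statistic and approximate DF."""
    n1, n2 = len(x1), len(x2)
    # Squared standard errors; statistics.variance divides by n - 1.
    a = statistics.variance(x1) / n1
    b = statistics.variance(x2) / n2
    t = (statistics.mean(x1) - statistics.mean(x2)) / math.sqrt(a + b)
    # Welch-Satterthwaite approximation; generally not an integer.
    df = (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))
    return t, df

# Hypothetical ratings for two classes (made up, not the asker's data).
class_a = [4, 5, 3, 4, 5, 4]
class_b = [3, 2, 4, 3]

t_welch, df_welch = welch_t(class_a, class_b)
print(t_welch, df_welch)
```

The resulting DF always lies between $\min(n_1, n_2) - 1$ and $n_1 + n_2 - 2$. If SciPy is available, `scipy.stats.ttest_ind(x1, x2, equal_var=False)` should serve as a cross-check and also reports a P-value.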

There is good theory behind the approximation, but most of what is known for practical applications is based on extensive simulation studies. A common recommendation is to use the Welch test by default. It performs much better when population variances are not equal and essentially no worse when they are. (Some statistical software, R's t.test function for example, does the Welch test by default, and the pooled test only if one overrides the default.)

Notes: (1) I mention the Welch test here in particular because you said your sample sizes are unequal, and for such 'unbalanced' cases, it is especially important to use the Welch test.

(2) In the situation you describe, there may be 'selection bias', because not everyone participated. Is it fair to assume nonparticipants would have given the same rankings on average as those who participated?

(3) If your 'rankings' are highly discrete and far from normally distributed, t tests might not be appropriate. In that case a permutation test would probably be best. (Traditional nonparametric tests break down when there are more than a few tied observations.)
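One way a permutation test might look is sketched below in Python. This is only one possible version: using the absolute difference in sample means as the test statistic, approximating the permutation distribution with random shuffles, and the names `permutation_test`, `n_perm`, and `seed` are all my own assumptions. The data are again made up.

```python
import random

def permutation_test(x1, x2, n_perm=10_000, seed=0):
    """Approximate two-sided permutation P-value for a difference in means."""
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    n1 = len(x1)
    combined = list(x1) + list(x2)

    def abs_mean_diff(values):
        first, second = values[:n1], values[n1:]
        return abs(sum(first) / len(first) - sum(second) / len(second))

    observed = abs_mean_diff(combined)
    count = 0
    for _ in range(n_perm):
        rng.shuffle(combined)  # randomly reassign ratings to the two groups
        # Small tolerance so ties are not lost to floating-point rounding.
        if abs_mean_diff(combined) >= observed - 1e-12:
            count += 1
    # Add 1 to numerator and denominator so the P-value is never exactly 0.
    return (count + 1) / (n_perm + 1)

# Hypothetical ratings for two classes (made up, not the asker's data).
class_a = [4, 5, 3, 4, 5, 4]
class_b = [3, 2, 4, 3]

p_value = permutation_test(class_a, class_b)
print(p_value)
```

Because the null distribution is built from the observed ratings themselves, heavy ties and non-normality are handled automatically.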