Test whether we can infer that the population means differ

Naturally, this result is far from significant, so we fail to reject the null hypothesis. Is this the right calculation?

There are 2 best solutions below

You mistakenly squared the sample variances, using ${\left(s_1^2\right)}^2$ and ${\left(s_2^2\right)}^2$, which gave you a much larger unpooled standard error than the correct value.

For unequal variances, we should use Welch's Test.

Let $$H_0 : \mu_1 = \mu_2$$

$$H_a : \mu_1 \neq \mu_2$$

Under $H_0$ we have, approximately,

$$\frac{\bar{x}_1-\bar{x}_2}{\sqrt{\frac{s_1^2}{n_1}+\frac{s_2^2}{n_2}}}\sim t_v$$

where

$$\begin{align*} v=\frac{\left(\frac{s_1^2}{n_1}+\frac{s_2^2}{n_2}\right)^2}{\frac{1}{n_1-1}\left(\frac{s_1^2}{n_1}\right)^2+\frac{1}{n_2-1}\left(\frac{s_2^2}{n_2}\right)^2} &=\frac{\left(\frac{729}{81}+\frac{350}{50}\right)^2}{\frac{1}{81-1}\left(\frac{729}{81}\right)^2+\frac{1}{50-1}\left(\frac{350}{50}\right)^2}\\ &\approx 127 \end{align*}$$
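As a quick check of the degrees-of-freedom arithmetic, here is a minimal Python sketch that evaluates the formula for $v$ from the summary statistics used in this answer ($s_1^2 = 729$, $n_1 = 81$, $s_2^2 = 350$, $n_2 = 50$). The function name `welch_df` is only an illustrative choice, not something from the original problem.

```python
def welch_df(s1_sq, n1, s2_sq, n2):
    """Welch-Satterthwaite approximate degrees of freedom from summary statistics."""
    a = s1_sq / n1  # estimated variance of the first sample mean
    b = s2_sq / n2  # estimated variance of the second sample mean
    return (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))


# Summary statistics taken from the answer above.
v = welch_df(729, 81, 350, 50)
print(round(v, 1))
```

The unrounded value is slightly above 127; truncating it to a whole number gives the 127 used here.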

Thus,

$$\frac{29-23}{\sqrt{\frac{729}{81}+\frac{350}{50}}}=1.5\sim t_{127}$$

But $t_{127,.025}\approx 1.97$

so we fail to reject the null hypothesis that the population means are equal.

Note that the two-sided p-value associated with the observed statistic $t = 1.5$ on $127$ degrees of freedom is $0.136$, so we are closer to rejection than your calculation suggested.

If we use the given value of $df=120$ we should come to a similar conclusion:

$$\frac{29-23}{\sqrt{\frac{729}{81}+\frac{350}{50}}}=1.5\sim t_{120}$$

and $t_{120,.025}\approx 1.9799$ so we fail to reject in this case as well.
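The test statistic and its two-sided p-value can be reproduced without a statistics package. The sketch below uses only the Python standard library and integrates the Student t density numerically with Simpson's rule; the helper names `t_pdf` and `two_sided_p` and the step count are illustrative assumptions, and a library routine for the t distribution would normally be preferable.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def two_sided_p(t, df, steps=2000):
    """P(|T| > |t|), using composite Simpson's rule on [0, |t|] (steps must be even)."""
    upper = abs(t)
    if upper == 0:
        return 1.0
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central = total * h / 3  # P(0 < T < |t|)
    return 1.0 - 2.0 * central


# Welch statistic from the summary values in the answer above.
t_stat = (29 - 23) / math.sqrt(729 / 81 + 350 / 50)

p_127 = two_sided_p(t_stat, 127)  # Welch degrees of freedom
p_120 = two_sided_p(t_stat, 120)  # degrees of freedom given in the problem
print(t_stat, round(p_127, 3), round(p_120, 3))
```

Both p-values are well above 0.05, so the choice between 127 and 120 degrees of freedom does not change the conclusion.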

The sample means $\bar X_1 = 29$ and $\bar X_2 = 23$ differ. The question is whether this difference in sample means is good evidence of a difference in population means.

Because of sampling error, sample means may not be entirely reliable estimates of population means. The degree of reliability depends on population variability and sample size.

It is appropriate to use the Welch two-sample t test unless one has background information (before seeing the current data) that the population variances are equal. If population variances are not the same, the traditional pooled t statistic may not have a t distribution. The discrepancy may be especially severe if (as here) sample sizes differ and the smaller sample comes from what may be the population with the larger variance.

The formula in @Remy's answer is correct. In order to check computations I put the summary statistics (sample sizes, means, and SDs) into Minitab software, with the results shown below:

Two-Sample T-Test and CI 

Sample   N  Mean  StDev  SE Mean
1       81  29.0   27.0      3.0
2       50  23.0   18.7      2.6

Difference = μ (1) - μ (2)
Estimate for difference:  6.00
95% CI for difference:  (-1.91, 13.91)
T-Test of difference = 0 (vs ≠): 
   T-Value = 1.50  P-Value = 0.136  DF = 127

The P-value of the test is 0.136, which is considerably greater than 5%, so we do not have evidence to reject $H_0: \mu_1 = \mu_2$ against its two-sided alternative.

The critical value for a two-sided test at the 5% level for DF = 127 is 1.9788. Because $|T| = 1.50 < 1.9788,$ we cannot reject at the 5% level. [For DF 120, suggested in the problem, the critical value is 1.9799, which leads to the same conclusion.]

Another indication that there is no significant difference between population means is that the 95% CI for $\mu_1 - \mu_2$ (in the Minitab printout) contains $0.$ Thus 'no difference in population means' is a reasonable conclusion.
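The confidence interval can also be checked by hand as (difference) ± (critical value) × (standard error of the difference). The following Python sketch does this using the critical value 1.9788 quoted above for DF = 127 and the sample variances 729 and 350 from the other answer; the variable names are illustrative only.

```python
import math

diff = 29 - 23                          # difference in sample means
se = math.sqrt(729 / 81 + 350 / 50)     # standard error of the difference
t_crit = 1.9788                         # two-sided 5% critical value for DF = 127 (from the text)

lower = diff - t_crit * se
upper = diff + t_crit * se
print(lower, upper)
```

The endpoints agree with the Minitab interval to within about 0.01; small discrepancies can arise because the printout works from the rounded SD 18.7. Either way the interval contains 0.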

Finally, as a rough rule of thumb for a two-sample t test with both sample sizes above 15, one can say that sample means must differ by more than about three standard errors in order for the difference to be significant. Here, the standard errors are $S_1/\sqrt{81} \approx 3$ and $S_2/\sqrt{50} \approx 2.6,$ whereas the absolute difference between sample means is only $|29 - 23| = 6.$ So it is not a surprise that the precise computation of the Welch test does not lead to rejection of $H_0.$