Can someone explain how to find the t-distribution for estimating the difference in means of two normal populations when variance is not known?


See the question referenced. How do I find the t-part? If my confidence level is $95\%$ what do I do with that number? I know what to do to find the inverse distribution in other estimations but this does not make sense to me. It is the last part of the answers section.

I have googled and found that the t statistic is $\frac{\bar x-\mu}{s / \sqrt{n}}$

I insert the numbers from the answers but I get the wrong answer.

$$\frac{-10}{4.64/ \sqrt{100}}=-21.5517241379$$

This is not close at all to the answer. Assignment with answers

Let $\overline X_i$ be the sample mean for the $i$th population, for $i=1,2,$ and assume both populations have the same variance $\sigma^2.$ Let $$ S_i^2 = \frac 1 {n_i-1} \sum_{k=1}^{n_i} (X_{i,k} - \overline X_i)^2 $$ be the two sample variances. Then $$ \frac{(n_i-1)S_i^2}{\sigma^2} \sim \chi^2_{n_i-1} $$ and these two chi-square random variables are independent of each other. Thus we have $$ V = \frac{(n_1-1)S_1^2 + (n_2-1)S_2^2}{\sigma^2} \sim \chi^2_{n_1+n_2-2}. \tag 1 $$ Next, $\overline X_1 - \overline X_2 \sim\operatorname N\left(\mu_1-\mu_2,\dfrac{\sigma^2}{n_1} + \dfrac{\sigma^2}{n_2}\right).$ So

$$ Z = \frac{(\overline X_1 - \overline X_2) - (\mu_1-\mu_2)}{\sigma\sqrt{\frac 1 {n_1} + \frac 1 {n_2}}} \sim \operatorname N(0,1). \tag 2 $$ Then recall that $(1)$ and $(2)$ are independent. (Short hint: $\operatorname{cov} (\overline X_1, X_{1,k} - \overline X_1) = 0.$)

So $\dfrac{Z}{\sqrt{V/(n_1+n_2-2)}}$ has a t-distribution with $n_1+n_2-2$ degrees of freedom (the $\sigma$ cancels out).
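
To connect this ratio to the confidence level in the question, here is a sketch under the same equal-variance assumption. The symbols $S_p$ (pooled standard deviation) and $t_{0.975,\,n_1+n_2-2}$ (the $0.975$ quantile of the t-distribution) are notation introduced here, not taken from the assignment. For a confidence interval the numerator is centered at $\mu_1-\mu_2$, and writing the ratio out gives

$$ S_p^2 = \frac{(n_1-1)S_1^2 + (n_2-1)S_2^2}{n_1+n_2-2}, \qquad T = \frac{(\overline X_1 - \overline X_2) - (\mu_1-\mu_2)}{S_p\sqrt{\frac 1 {n_1} + \frac 1 {n_2}}} \sim t_{n_1+n_2-2}. $$

A $95\%$ level leaves $2.5\%$ in each tail, so the quantile to look up is the $0.975$ quantile, and solving $|T| \le t_{0.975,\,n_1+n_2-2}$ for $\mu_1-\mu_2$ gives the interval

$$ (\overline X_1 - \overline X_2) \pm t_{0.975,\,n_1+n_2-2}\; S_p\sqrt{\frac 1 {n_1} + \frac 1 {n_2}}. $$

Note that the standard error uses $\sqrt{1/n_1 + 1/n_2}$, not $1/\sqrt{n_1+n_2}$.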

The answer you show seems to use the pooled 2-sample t test without any assurance that the population variances are equal (and in fact the sample standard deviations give clues that the population variances may be something like 16 and 25, respectively).

Minitab will accept summarized data as you have given. Of course, with summarized data we cannot verify normality.

Without agreeing that it is a correct approach and recognizing you are trying to get an answer that matches the pooled t approach, I ran your summary data in Minitab. The main point is that the output (below) shows a confidence interval, so you can match your CI with the one from Minitab. I suppose a 95% CI may be part of your desired answer.

Two-Sample T-Test and CI 

Sample   N    Mean  StDev  SE Mean
1       36  120.00   4.00     0.67
2       64  130.00   5.00     0.63

Difference = μ (1) - μ (2)
Estimate for difference:  -10.000
95% CI for difference:  (-11.930, -8.070)
T-Test of difference = 0 (vs ≠): 
  T-Value = -10.28  P-Value = 0.000  DF = 98
Both use Pooled StDev = 4.6675
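
The pooled figures in this output can be checked by hand from the summary statistics alone. The following is a minimal sketch in Python using only the standard library. The function name `pooled_summary` is hypothetical, and the critical value `T_CRIT_98` is an assumption read from a t table (0.975 quantile, 98 degrees of freedom) rather than computed.

```python
import math

def pooled_summary(n1, mean1, sd1, n2, mean2, sd2):
    """Pooled two-sample t quantities from summary statistics (hypothetical helper)."""
    df = n1 + n2 - 2
    # Pooled SD: square root of the df-weighted average of the sample variances.
    sp = math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df)
    # Standard error of the difference in sample means.
    se = sp * math.sqrt(1 / n1 + 1 / n2)
    diff = mean1 - mean2
    return diff, sp, se, diff / se, df

diff, sp, se, t_value, df = pooled_summary(36, 120.0, 4.0, 64, 130.0, 5.0)

# Assumption: 0.975 quantile of the t-distribution with 98 df, taken from a table.
T_CRIT_98 = 1.9845
ci = (diff - T_CRIT_98 * se, diff + T_CRIT_98 * se)

print(f"pooled SD = {sp:.4f}, SE = {se:.4f}, T = {t_value:.2f}, DF = {df}")
print(f"95% CI = ({ci[0]:.3f}, {ci[1]:.3f})")
```

With the summary data above, the pooled SD, T value, and interval should agree with the Minitab output to the displayed precision, provided the tabulated critical value is accurate.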

It is not your fault that the answer book makes a mistake, so I'm not downvoting your Question. However---just to keep this answer respectable (and not misleading to others who may look at this page)---I show below output from a more appropriate Welch 2-sample t test that does not assume equal population variances.

There are two major differences: (a) A different standard error is used in the denominator of the T statistic. $[T=-10.28$ vs. correct $T=-10.94].\;$ (b) The degrees of freedom differ $[\nu = 98$ vs. $\nu=86.]$ Most of the difference between the two CIs $[\,(-11.9,-8.1)$ vs. $(-11.8,-8.2)\,]\,$ is due to (a).

Two-Sample T-Test and CI 

Sample   N    Mean  StDev  SE Mean
1       36  120.00   4.00     0.67
2       64  130.00   5.00     0.63


Difference = μ (1) - μ (2)
Estimate for difference:  -10.000
95% CI for difference:  (-11.817, -8.183)
T-Test of difference = 0 (vs ≠): 
 T-Value = -10.94  P-Value = 0.000  DF = 86
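
The Welch figures can be checked the same way. This is a minimal sketch in Python using only the standard library. The function name `welch_summary` is hypothetical, and `T_CRIT_86` is an assumption read from a t table (0.975 quantile, 86 degrees of freedom). The degrees of freedom come from the Welch-Satterthwaite approximation.

```python
import math

def welch_summary(n1, mean1, sd1, n2, mean2, sd2):
    """Welch two-sample t quantities from summary statistics (hypothetical helper)."""
    # Estimated variance of each sample mean, with no pooling.
    v1, v2 = sd1**2 / n1, sd2**2 / n2
    se = math.sqrt(v1 + v2)
    # Welch-Satterthwaite approximate degrees of freedom (generally not an integer).
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    diff = mean1 - mean2
    return diff, se, diff / se, df

w_diff, w_se, w_t, w_df = welch_summary(36, 120.0, 4.0, 64, 130.0, 5.0)

# Assumption: 0.975 quantile of the t-distribution with 86 df, taken from a table.
T_CRIT_86 = 1.9879
w_ci = (w_diff - T_CRIT_86 * w_se, w_diff + T_CRIT_86 * w_se)

print(f"SE = {w_se:.4f}, T = {w_t:.2f}, approximate DF = {w_df:.2f}")
print(f"95% CI = ({w_ci[0]:.3f}, {w_ci[1]:.3f})")
```

The approximate degrees of freedom come out a little above 86, which is consistent with the reported DF = 86 if Minitab rounds down to a whole number.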