How to rigorously justify using a T-test rather than just $\bar{X}-\bar{Y}$?


I'm reading Larson's Introduction to Probability Theory and Statistical Inference, where he introduces the $T$-test.

The derivation is rather complicated, and it seems to me we could just base our test on $\bar{X}-\bar{Y}$: under the null hypothesis that all $X$ and $Y$ observations have the same mean and the same variance, $\bar{X}-\bar{Y}$ is normal with mean zero and an easily calculable variance. So we could easily compute a cutoff on $|\bar{X}-\bar{Y}|$ that would give us the desired Type I error.

Yet Larson proceeds by finding the test via the likelihood ratio test which ultimately leads to the $T$-statistic.

I assume the $T$-statistic approach has better properties. But shouldn't the textbooks outline what those properties are? Otherwise they're just asking us to proceed on faith, and that's not how we should do math.

So if anybody can give me some insight into the advantages of using $T$ rather than just $\bar{X}-\bar{Y}$, and how we determine them, I would greatly appreciate it.

Thank you!


BEST ANSWER

When you have to estimate $\sigma$ based on the sample data, there is some error in the estimate.

When you make predictions about the distribution of $X$, if you assume normality and treat the estimated $\sigma$ as if it were the true value, you will predict results in too narrow a band.

The $t$-distribution accounts for this mis-estimation of $\sigma$ and predicts a fatter-tailed distribution.
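A quick simulation (my illustration, not part of the answer; the sample size, seed, and replication count are arbitrary choices) makes this concrete: with small samples, a cutoff from the normal distribution with the estimated $\sigma$ plugged in rejects a true null too often, while the $t$ cutoff holds the nominal level.

```python
# Illustration (not from the answer): plugging the estimated sigma into a
# normal-distribution cutoff rejects a true H0 too often for small n,
# while the t cutoff holds the nominal 5% level.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n, alpha, reps = 5, 0.05, 100_000

# H0 is true in every replication: same mean, same variance
x = rng.normal(size=(reps, n))
y = rng.normal(size=(reps, n))

# pooled variance estimate and the standardized difference
s2 = ((n - 1) * x.var(axis=1, ddof=1) + (n - 1) * y.var(axis=1, ddof=1)) / (2 * n - 2)
t_stat = (x.mean(axis=1) - y.mean(axis=1)) / np.sqrt(s2 * 2 / n)

z_rate = np.mean(np.abs(t_stat) > stats.norm.ppf(1 - alpha / 2))
t_rate = np.mean(np.abs(t_stat) > stats.t.ppf(1 - alpha / 2, df=2 * n - 2))
print(f"normal cutoff rejects {z_rate:.3f}, t cutoff rejects {t_rate:.3f}")
```

With $n = 5$ per group the normal cutoff rejects noticeably more often than the nominal 5%; the gap shrinks as $n$ grows, since the $t$ distribution approaches the standard normal.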

ANOTHER ANSWER

Assume, for the sake of simplicity, that $n_Y = n_X = n$ and $\sigma^2_Y = \sigma^2_X = \sigma^2$, and test $H_0: \mu_X = \mu_Y$ against the two-sided alternative $H_1: \mu_X \neq \mu_Y$. As in real-world cases, you don't know the values of any of the parameters; however, let us assume that you know that both $X$ and $Y$ are normally distributed. As such, clearly under $H_0$ you have that $$ \bar{X}_n - \bar{Y}_n \sim N(0, 2\sigma^2/n). $$
You are interested in the probability of observing the sample difference $\hat{\delta}_n = \bar{X}_n - \bar{Y}_n$ under $H_0$; if such a difference is very unlikely, you reject $H_0$. So you construct a decision rule, e.g., $$ \varphi(X,Y)=I(|\bar{X}_n-\bar{Y}_n|\ge c). $$ You are basically searching for the $c$ that gives a test of size $\alpha$ (the probability of mistakenly rejecting $H_0$). As such, you are interested in a $c$ that is a function of the sample (and other known constants), and, if possible, in some general procedure for such problems. By symmetry, $$ \alpha = E_{H_0}I(|\bar{X}_n-\bar{Y}_n|\ge c)=P(\bar{X}_n-\bar{Y}_n\ge c)+P(\bar{X}_n-\bar{Y}_n\le -c) = 2P(\bar{X}_n-\bar{Y}_n\ge c). $$
Now, you are interested in finding a statistic whose distribution does not depend on unknown parameters. Such a statistic arises from the basic standardization procedure of subtracting the expected value, $\delta_0 = 0$, and dividing by the estimated standard deviation $S\sqrt{2/n}$, where $S^2$ is the pooled estimator of $\sigma^2$, i.e., $$ 2P(\bar{X}_n-\bar{Y}_n\ge c)=2-2P\left(\frac{\bar{X}_n-\bar{Y}_n}{S\sqrt{2/n}}<\frac{c}{S\sqrt{2/n}}\right) . $$ Now, recall that the distribution of $$ T = \frac{N(0,1)}{\sqrt{\chi^2_{(k)}/k}}, $$ with numerator and denominator independent, is what we call the $t$ distribution with $k$ degrees of freedom. Here $(2n-2)S^2/\sigma^2 \sim \chi^2_{(2n-2)}$, so the standardized difference has a $t_{(2n-2)}$ distribution. Thus, $$ \alpha=2-2F_{t(2n-2)}\left( \frac{c}{S\sqrt{2/n}}\right) , $$ hence $$ c = S\sqrt{\frac{2}{n}}\; t^{(1-\alpha/2)}_{(2n-2)} \, . $$ Hence,

  1. the $t$ distribution arises "naturally" as the distribution of the standardized statistic;
  2. the $t$ statistic's distribution does not depend on the unknown parameter $\sigma^2$, hence allows you to construct exact confidence intervals and rejection regions based on the sample values only;
  3. it indeed possesses "fatter" tails than $N(0,1)$, hence incorporates the additional uncertainty induced by estimating $\sigma^2$.
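The cutoff construction above can be checked numerically (a sketch of mine, not part of the answer; the sample sizes, means, and seed are arbitrary): rejecting when $|\bar{X}_n-\bar{Y}_n|$ exceeds the derived cutoff $c$ is exactly the same rule as rejecting when SciPy's pooled two-sample $t$-test reports a p-value below $\alpha$.

```python
# Check (illustrative): the cutoff c = S*sqrt(2/n)*t_{(2n-2)}^{(1-alpha/2)}
# rejects exactly when the pooled two-sample t-test
# (scipy.stats.ttest_ind, default equal_var=True) gives p < alpha.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n, alpha = 12, 0.05
x = rng.normal(0.0, 1.0, size=n)
y = rng.normal(0.8, 1.0, size=n)   # arbitrary mean shift for illustration

# pooled estimate S of the common standard deviation
S = np.sqrt(((n - 1) * x.var(ddof=1) + (n - 1) * y.var(ddof=1)) / (2 * n - 2))
c = S * np.sqrt(2 / n) * stats.t.ppf(1 - alpha / 2, df=2 * n - 2)

reject_by_cutoff = abs(x.mean() - y.mean()) >= c
reject_by_pvalue = stats.ttest_ind(x, y).pvalue < alpha
print(reject_by_cutoff, reject_by_pvalue)  # the two rules agree
```

The agreement is exact because $|\bar{X}_n-\bar{Y}_n| \ge c$ holds if and only if $|T| \ge t^{(1-\alpha/2)}_{(2n-2)}$, which is the $t$-test's rejection region.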