I'm reading Larson's *Introduction to Probability Theory and Statistical Inference*, where he introduces the $T$-test.
The derivation is rather complicated, and it seems to me we could just base our test on $\bar{X}-\bar{Y}$: under the null hypothesis that all the $X$ and $Y$ observations share the same mean and the same variance, $\bar{X}-\bar{Y}$ is normal with mean zero and an easily calculable variance. So we could easily compute a cutoff for $|\bar{X}-\bar{Y}|$ that gives the desired Type I error.
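Concretely, the cutoff I have in mind would be the following (writing $n$ and $m$ for the two sample sizes and assuming a common, known variance $\sigma^2$): under $H_0$,

$$\bar{X}-\bar{Y} \sim N\!\left(0,\ \sigma^2\left(\tfrac{1}{n}+\tfrac{1}{m}\right)\right),$$

so rejecting when

$$|\bar{X}-\bar{Y}| > z_{\alpha/2}\,\sigma\sqrt{\tfrac{1}{n}+\tfrac{1}{m}}$$

gives Type I error exactly $\alpha$.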
Yet Larson proceeds by finding the test via the likelihood ratio test which ultimately leads to the $T$-statistic.
I assume the $T$-statistic approach has better properties, but shouldn't textbooks spell out what those properties are? Otherwise they're asking us to proceed on faith, and that's not how we should do math.
So if anybody can give me some insight into the advantages of using $T$ over just $\bar{X}-\bar{Y}$, and how we determine them, I would greatly appreciate it.
Thank you!
When you have to estimate $\sigma$ from the sample data, there is some error in that estimate.
If you then make predictions about the distribution of $\bar{X}-\bar{Y}$ assuming normality and treating the estimated $\sigma$ as if it were exact, you will predict results in too narrow a band.
The $t$-distribution accounts for this uncertainty in the estimate of $\sigma$ and predicts a fatter-tailed distribution.
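You can see this directly by simulation. The sketch below (my own illustration, with arbitrary sample sizes of 5 per group and textbook critical values) draws many pairs of samples under the null and compares two rules: rejecting when the pooled-estimate statistic exceeds the *normal* cutoff, versus the *t* cutoff. The normal cutoff rejects noticeably more often than the nominal 5%.

```python
import numpy as np

rng = np.random.default_rng(0)
n = m = 5            # small samples, where estimating sigma matters most
reps = 50_000
# Two-sided 5% critical values from standard tables:
z_crit = 1.960       # N(0, 1)
t_crit = 2.306       # t with n + m - 2 = 8 degrees of freedom

# Both samples drawn under H0: same mean (0), same variance (1).
x = rng.normal(size=(reps, n))
y = rng.normal(size=(reps, m))

# Pooled variance estimate and the usual two-sample statistic.
sp2 = ((n - 1) * x.var(axis=1, ddof=1)
       + (m - 1) * y.var(axis=1, ddof=1)) / (n + m - 2)
t_stat = (x.mean(axis=1) - y.mean(axis=1)) / np.sqrt(sp2 * (1 / n + 1 / m))

# Rejection rates: same statistic, different cutoffs.
z_rate = np.mean(np.abs(t_stat) > z_crit)  # normal cutoff, sigma estimated
t_rate = np.mean(np.abs(t_stat) > t_crit)  # t cutoff
print(f"nominal 0.05 | z-cutoff rejects: {z_rate:.3f} | t-cutoff rejects: {t_rate:.3f}")
```

The z-cutoff rate lands well above 0.05 (the "too narrow a band" effect), while the t-cutoff stays at the nominal level; the gap shrinks as $n$ and $m$ grow, since $\hat\sigma$ becomes more accurate.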