What's the relationship between t test and t distribution?


This question always confuses me. I know the Student's t distribution is defined as follows: let $Z \sim N(0,1)$ and $V \sim \chi^2(v)$. If $Z$ and $V$ are independent, then the random variable

$$T=\frac{Z}{\sqrt{V/v}}$$ has the Student's t distribution with $v$ degrees of freedom.

I also know the parametric t-test. Assume $F=N(\mu, \sigma^2)$ and $X_1, X_2,\dots,X_n \sim N(\mu, \sigma^2)$, iid. The test statistic is:

$$T= \frac{\bar{X}-\mu}{S/\sqrt{n}} \sim t_{n-1}$$

What's the relationship between the two above?


There are 1 best solutions below


The issue is making the connection between the test statistic

$$T_{\text{test}}=\frac{\bar{X}\,-\,\mu}{S/\sqrt{n}}$$

used to assess the deviation of the sample mean from the population mean when the population variance $\sigma^2$ is not known (as in real-life situations), and the t-distribution:

$$T_{\text{test}}=\frac{\bar{X}\,-\,\mu}{S/\sqrt{n}}\sim t_{n-1}$$

Perhaps the following simple manipulation helps to see it a bit more intuitively. Starting from the intended statistic:

\begin{align}
\frac{\bar{X}\,-\,\mu}{\frac{S}{\sqrt{n}}} &= \frac{\bar{X}\,-\,\mu}{\frac{\sigma}{\sqrt{n}}} \frac{1}{\frac{S}{\sigma}}\\[2ex]
&\sim Z\,\frac{1}{\frac{S}{\sigma}}\\[2ex]
&=\frac{Z}{\sqrt{\frac{\color{blue}{\sum_{i=1}^n(X_i-\bar X)^2}}{\,\color{blue}{\sigma^2}}\frac 1{(n-1)}}}\\[2ex]
&\sim\frac{Z}{\sqrt{\frac{\color{blue}{\chi_{n-1}^2}}{n-1}}}
\end{align}

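where $\small X_1,\dots,X_n$ are iid with $X_i \sim N(\mu,\sigma^2),$ and $S^2=\small \frac{\displaystyle\sum_{i=1}^n(X_i-\bar X)^2}{n-1}$ is the sample variance. Matching the definition of the t distribution also requires the numerator and the denominator to be independent, which holds here because $\bar X$ and $S^2$ are independent for normal samples.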
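As a quick numerical sanity check of the claim $T_{\text{test}}\sim t_{n-1}$, here is a minimal Python sketch using only the standard library. The sample size, mean, standard deviation, seed, and function names are arbitrary choices made for illustration, not part of the argument above; the constant 2.5706 is the usual table value for a two-sided 5% test with 5 degrees of freedom.

```python
import math
import random


def t_statistic(sample, mu):
    """Return (xbar - mu) / (S / sqrt(n)) for one sample."""
    n = len(sample)
    xbar = sum(sample) / n
    # Sample standard deviation S, with the n - 1 denominator.
    s = math.sqrt(sum((x - xbar) ** 2 for x in sample) / (n - 1))
    return (xbar - mu) / (s / math.sqrt(n))


def simulate_t_statistics(n=6, mu=10.0, sigma=3.0, reps=50_000, seed=0):
    """Draw `reps` normal samples of size n and return their t statistics."""
    rng = random.Random(seed)
    return [
        t_statistic([rng.gauss(mu, sigma) for _ in range(n)], mu)
        for _ in range(reps)
    ]


# Two-sided 5% critical value of the t distribution with n - 1 = 5 df.
T_CRIT_5DF = 2.5706

t_values = simulate_t_statistics()
tail_fraction = sum(abs(t) > T_CRIT_5DF for t in t_values) / len(t_values)
print(f"fraction with |T| > {T_CRIT_5DF}: {tail_fraction:.4f}")
```

If the statistic really follows $t_5$, the printed fraction should land near 0.05. For a standard normal variable the same cutoff would be exceeded only about 1% of the time, so the heavier tails come from replacing $\sigma$ with $S$.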

The question then boils down to getting an idea of the step in blue above:

$$\small \frac{\displaystyle\sum_{i=1}^n(X_i-\bar X)^2}{\sigma^2}\sim \chi_{n-1}^2$$
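The blue step can also be checked empirically. The sketch below (standard-library Python, with illustrative parameter values and helper names that are my own assumptions) relies on the fact that a $\chi^2_k$ variable has mean $k$ and variance $2k$. It compares the sum of squares taken about $\bar X$ with the one taken about the true mean $\mu$.

```python
import random


def scaled_sums_of_squares(n=6, mu=10.0, sigma=3.0, reps=50_000, seed=1):
    """Return two lists: sum (X_i - Xbar)^2 / sigma^2 and sum (X_i - mu)^2 / sigma^2."""
    rng = random.Random(seed)
    about_xbar, about_mu = [], []
    for _ in range(reps):
        x = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = sum(x) / n
        about_xbar.append(sum((xi - xbar) ** 2 for xi in x) / sigma**2)
        about_mu.append(sum((xi - mu) ** 2 for xi in x) / sigma**2)
    return about_xbar, about_mu


def mean_and_variance(values):
    """Sample mean and unbiased sample variance of a list of numbers."""
    m = sum(values) / len(values)
    v = sum((u - m) ** 2 for u in values) / (len(values) - 1)
    return m, v


about_xbar, about_mu = scaled_sums_of_squares()
m_xbar, v_xbar = mean_and_variance(about_xbar)
m_mu, v_mu = mean_and_variance(about_mu)
print(f"about Xbar: mean {m_xbar:.2f}, variance {v_xbar:.2f}")  # expect roughly n - 1 and 2(n - 1)
print(f"about mu:   mean {m_mu:.2f}, variance {v_mu:.2f}")      # expect roughly n and 2n
```

With $n=6$, the first line should come out near 5 and 10, and the second near 6 and 12: centering at $\bar X$ instead of $\mu$ costs exactly one degree of freedom.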

From Wikipedia, the chi-square distribution with $k$ degrees of freedom is the distribution of a sum of the squares of $k$ independent standard normal random variables.

However, as also explained on Wikipedia, the iid normally distributed random variables from which the sample is obtained form a random vector $\small \begin{bmatrix}X_1,\dots,X_n \end{bmatrix}^\top,$ and the vector of deviations of these iid normal rv's from their sample mean, $\small X_i-\bar X,$ i.e. $$\small\begin{bmatrix} X_1 - \bar X\\ \vdots\\ X_n-\bar X \end{bmatrix}$$

can be seen as its projection onto the $(n-1)$-dimensional subspace of $\mathbb R^n$ orthogonal to the span of the vector $\small \begin{bmatrix}1,\dots,1 \end{bmatrix}^\top$


which results in the loss of exactly one degree of freedom. The $n-1$ degrees of freedom can also be seen from the constraint $\sum_{i=1}^n (X_i-\bar X)=0.$

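In an orthonormal basis of that subspace, the projected vector has $n-1$ independent $N(0,\sigma^2)$ coordinates, so its squared norm $\sum_{i=1}^n (X_i-\bar X)^2$ follows a $\chi^2_{n-1}$ distribution scaled by a factor of $\sigma^2$.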
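The projection picture can be made concrete with the centering matrix $P = I - \frac1n \mathbf 1 \mathbf 1^\top$, which sends a vector $x$ to the vector of deviations $x_i - \bar x$. The following is a small pure-Python sketch under my own choice of $n$, data, and helper names. It checks that $P$ is idempotent (so it is a projection), that it annihilates the all-ones vector, and that its trace, which for a projection equals the dimension of the subspace it projects onto, is $n-1$.

```python
def centering_matrix(n):
    """P = I - (1/n) * ones * ones^T, which maps x to x - mean(x)."""
    return [
        [(1.0 if i == j else 0.0) - 1.0 / n for j in range(n)]
        for i in range(n)
    ]


def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def matvec(a, x):
    """Matrix-vector product."""
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]


n = 5
P = centering_matrix(n)
x = [2.0, 4.0, 4.0, 5.0, 10.0]  # arbitrary example data with mean 5

residuals = matvec(P, x)         # deviations x_i - mean(x)
P_squared = matmul(P, P)

# P * P == P up to rounding, so P is a projection.
is_idempotent = all(
    abs(P_squared[i][j] - P[i][j]) < 1e-12 for i in range(n) for j in range(n)
)
# P sends the all-ones direction to zero.
kills_ones = all(abs(v) < 1e-12 for v in matvec(P, [1.0] * n))
# Trace of a projection equals the dimension of its range.
trace = sum(P[i][i] for i in range(n))

print("residuals:", residuals)
print("idempotent:", is_idempotent, "| kills ones:", kills_ones, "| trace:", trace)
```

The residuals sum to zero, which is the same constraint $\sum_{i=1}^n (X_i-\bar X)=0$ seen from the matrix side: one direction of $\mathbb R^n$ is removed, leaving $n-1$.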