I'm currently messing around with confidence intervals and I can't really understand how a $t$ distribution converges to a normal distribution for large $n$.
For example, suppose we want to construct a $95\%$ confidence interval when we have a sample mean $\bar{X} = 74.8$ and sample standard deviation $S = 1.23$ with $n = 143$.
I would construct the confidence interval using
$$\left(\bar{X} - z_{1 - \frac{\alpha}{2}} \frac{S}{\sqrt{n}},\; \bar{X} + z_{1 - \frac{\alpha}{2}} \frac{S}{\sqrt{n}}\right)$$
since $n > 30$. If $n < 30$, I would have used a $t$-distribution instead.
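To make the comparison concrete, here is a short sketch (assuming SciPy is available) that builds both intervals for the numbers above; at $n = 143$ the two critical values, and hence the two intervals, are nearly identical:

```python
from math import sqrt
from scipy import stats

xbar, s, n = 74.8, 1.23, 143   # sample mean, sample standard deviation, sample size
alpha = 0.05

z = stats.norm.ppf(1 - alpha / 2)          # ~ 1.960
t = stats.t.ppf(1 - alpha / 2, df=n - 1)   # ~ 1.977, slightly larger

half_z = z * s / sqrt(n)  # half-width of the normal-based interval
half_t = t * s / sqrt(n)  # half-width of the t-based interval

print(f"z-interval: ({xbar - half_z:.3f}, {xbar + half_z:.3f})")
print(f"t-interval: ({xbar - half_t:.3f}, {xbar + half_t:.3f})")
```

The two half-widths differ by less than $0.002$ here, which is why the normal approximation is harmless at this sample size.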
My question is: why does the $t$-distribution approach a normal distribution for relatively large $n$?
The $t$ distribution arises because you estimate the population standard deviation $\sigma$ by the sample standard deviation $S$. For small $n$, there is a significant chance that $S$ is quite a bit smaller than $\sigma$; thus for fixed $c$ and $n$, there is a significant probability that $\overline{X}$ is within $c \sigma$ of $\mu$ but not within $cS$ of $\mu$. (The reverse is possible too, but less likely.) For large $n$, however, $S$ is essentially guaranteed to be very close to $\sigma$, because it is a consistent estimator of $\sigma$. And of course, if $S$ is close to $\sigma$, then $\overline{X}$ being within $c\sigma$ of $\mu$ and being within $cS$ of $\mu$ are nearly equivalent events.
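You can see this consistency directly by simulation. The sketch below (standard library only; the sample sizes and repetition count are arbitrary choices) draws many normal samples at each size and measures how far $S$ typically strays from $\sigma = 1$; the average discrepancy shrinks steadily as $n$ grows:

```python
import random
import statistics

random.seed(0)
mu, sigma = 0.0, 1.0

# For each sample size, draw many samples and record how far the
# sample standard deviation S lands from the true sigma.
for n in (5, 30, 143, 1000):
    devs = []
    for _ in range(2000):
        sample = [random.gauss(mu, sigma) for _ in range(n)]
        devs.append(abs(statistics.stdev(sample) - sigma))
    print(f"n={n:5d}  average |S - sigma| = {sum(devs) / len(devs):.3f}")
```

At $n = 5$ the typical error in $S$ is a sizeable fraction of $\sigma$; by $n = 1000$ it is a couple of percent, so the $cS$ and $c\sigma$ intervals nearly coincide.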
Note that, strictly speaking, you should always use the $t$ distribution for confidence intervals from a normally distributed population with unknown standard deviation. For large $n$, the resulting interval is just negligibly different from the corresponding interval constructed with the normal distribution. How large $n$ needs to be really depends on how small a difference you are willing to treat as negligible.
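One way to judge "large enough" for yourself is to tabulate the $97.5\%$ critical value $t_{n-1,\,0.975}$ against $z_{0.975}$ for a few sample sizes (a sketch assuming SciPy is available; the listed sample sizes are arbitrary):

```python
from scipy import stats

z = stats.norm.ppf(0.975)
print(f"z_0.975 = {z:.4f}")

# The t critical value decreases toward z as the degrees of freedom grow.
for n in (5, 10, 30, 100, 1000):
    t = stats.t.ppf(0.975, df=n - 1)
    print(f"n={n:5d}  t_(n-1) = {t:.4f}  excess over z = {t - z:.4f}")
```

The excess is large for $n = 5$, already modest by $n = 30$ (hence the common rule of thumb), and essentially zero by $n = 1000$; where you draw the line depends on your tolerance.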