Why use t distribution when we are already using Bessel correction?


As far as I understand it, when estimating the population mean from a sample without knowing the population standard deviation $\sigma$, we can't use the $Z$-test. According to the Central Limit Theorem, the sampling distribution of sample means is a normal distribution with mean${}=\mu$ and standard deviation${} = \sigma / \sqrt n$. But the sample standard deviation $s = \frac {\sum(x_i - \mu)^2}{n} $ underestimates the true population parameter $\sigma$ (i.e. $s < \sigma$). For that reason, we cannot apply the Central Limit Theorem using the sample standard deviation directly.

But we use Bessel's correction precisely for this reason! We write $s = \frac {\sum(x_i - \mu)^2}{n-1} $ so that now $s ≈ \sigma$. My question is: after applying Bessel's correction, why can't we use the $Z$-test directly to estimate the population mean $\mu$?

The $T$-distribution is a bit flatter than the $Z$-distribution. This essentially reflects the fact that the population variance is a bit more than the variance we estimated from the sample. But did we not already take that into consideration by applying Bessel's correction?

Now another question arises in this context. From the Central Limit Theorem, the sampling distribution of sample means is essentially a normal distribution and nothing else: never any other distribution, and certainly NOT a $T$-distribution. Just because we may fail to estimate the variance does not mean we should change the sampling distribution of sample means from a $Z$-distribution to a $T$-distribution. If, instead of the $Z$-distribution, you had taken some other normal distribution with variance "a bit more than 1", at least that would have made sense. A $T$-distribution looks like a normal distribution, but it is NOT a $Z$-distribution with its variance increased by some amount. Just because there is some uncertainty in the determination of $\sigma$, why do you assume that some entirely different distribution should better approximate the distribution of sample means?

Note: I have already looked into the following answers, and they do not satisfactorily answer my question:

Estimating population SD when calculating t-statistic

When the population variance is unknown, we should use t-distribution.


There are 3 answers below.

---

But the sample standard deviation $s = \frac {\sum(x_i - \mu)^2}{n} $ underestimates the true population parameter $\sigma$ (i.e. $s < \sigma$).

Attention to some details is needed here.

  • You first said $\mu$ is the mean of the population from which the sample is drawn.
  • But then you said $\frac{\sum(x_i-\mu)^2} n$ is the sample standard deviation. That is wrong. You should distinguish between the population mean $\mu$ and the sample mean $\overline x.$ The latter is different for different samples of $n$ observations; the former is not. The sample variance, not the sample standard deviation, is $\frac{\sum (x_i-\overline x)^2} n,$ and that differs from $\frac{\sum(x_i-\mu)^2} n.$
  • This should be called $s^2,$ not $s,$ and (as mentioned) it's the sample variance, not the sample standard deviation.
  • This sample variance, on average, underestimates $\sigma^2.$ Note: $\sigma^2,$ not $\sigma.$
  • That is not the same as saying $s^2<\sigma^2;$ rather it means $\operatorname E(s^2) < \sigma^2.$

But we use Bessel's correction precisely for this reason! We write $s = \frac {\sum(x_i - \mu)^2}{n-1} $ so that now $s ≈ \sigma$.

  • No. We write $s^2$ (not $s$) ${} = \frac{\sum(x_i-\overline x)^2}{n-1},$ not $\frac{\sum(x_i-\mu)^2} n,$ and it is $\operatorname E(s^2)$ (not $\operatorname E(s)$) that equals $\sigma^2$ (not $\sigma$): the correction makes the estimator unbiased on average, i.e. for $\operatorname E(s^2),$ not for each individual value of $s^2.$
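The points above are easy to check numerically. Below is a minimal Monte Carlo sketch (assuming normally distributed data; the parameter values, seed, and variable names are all illustrative) showing that the $/n$ estimator underestimates $\sigma^2$ on average, that Bessel's correction removes that bias for $s^2,$ and that $\operatorname E(s)$ nevertheless still falls short of $\sigma$:

```python
# Monte Carlo check: the /n variance is biased low, the /(n-1) variance
# is unbiased for sigma^2, yet the mean of s still falls below sigma.
import numpy as np

rng = np.random.default_rng(0)
mu, sigma, n, reps = 10.0, 2.0, 5, 200_000

x = rng.normal(mu, sigma, size=(reps, n))
xbar = x.mean(axis=1, keepdims=True)
ss = ((x - xbar) ** 2).sum(axis=1)   # sum of squared deviations from x-bar

u2 = ss / n        # uncorrected "sample variance"
s2 = ss / (n - 1)  # Bessel-corrected sample variance

print(u2.mean())           # near sigma^2 * (n-1)/n = 3.2, biased low
print(s2.mean())           # near sigma^2 = 4.0, unbiased
print(np.sqrt(s2).mean())  # still below sigma = 2.0 (Jensen's inequality)
```

The last line illustrates why even Bessel's correction does not make $s$ itself unbiased for $\sigma$: taking a square root does not commute with taking an expectation.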

My question is: after applying Bessel's correction, why can't we use the $Z$-test directly to estimate the population mean $\mu$?

We have $s^2= \sum(x_i-\overline x)^2/(n-1).$

We know that

$$ \frac{\overline x - \mu}{\sigma/\sqrt n} \sim \operatorname N(0,1). \tag 1 $$

But $$ \frac{\overline x - \mu}{s/\sqrt n} \sim t_{n-1}. \tag 2 $$

The latter is used for deriving confidence intervals and hypothesis tests on $\mu$ because we cannot observe $\sigma,$ whereas we can observe $s.$

We need a pivotal random variable in which the only unobservable quantity is $\mu.$ $\text{“}$Pivotal$\text{''}$ means that its probability distribution does not depend on any unobservables, and that the only unobservable entering the computation of the pivotal quantity is the one for which we want a confidence interval or a hypothesis test.
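A quick simulation makes the difference between $(1)$ and $(2)$ concrete. The sketch below (assuming normal data; the sample size, seed, and names are illustrative) builds 90% intervals from $s$ using first the $N(0,1)$ critical value and then the $t_{n-1}$ critical value; only the latter attains the nominal coverage:

```python
# Coverage check: with sigma replaced by s, z critical values under-cover,
# while t_{n-1} critical values give (approximately) the nominal 90%.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
mu, sigma, n, reps = 0.0, 1.0, 5, 100_000

x = rng.normal(mu, sigma, size=(reps, n))
xbar = x.mean(axis=1)
s = x.std(axis=1, ddof=1)                       # Bessel-corrected s

half_z = stats.norm.ppf(0.95) * s / np.sqrt(n)  # half-width using z
half_t = stats.t.ppf(0.95, n - 1) * s / np.sqrt(n)  # half-width using t

cover_z = np.mean(np.abs(xbar - mu) < half_z)
cover_t = np.mean(np.abs(xbar - mu) < half_t)
print(cover_z)  # noticeably below 0.90
print(cover_t)  # close to 0.90
```

The shortfall of `cover_z` is exactly the effect the heavier tails of $t_{n-1}$ are designed to absorb, and it is largest for small $n.$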

---

My first posted answer spent a lot of time on addressing errors. This one will concentrate on the fact that the topic of Bessel's correction in this context is really altogether a separate thing from the topic of how the t-distribution enters this problem.

We have:

  • $X_1,\ldots,X_n\sim\text{i.i.d.} \operatorname N(\mu,\sigma^2).$
  • $\overline X = (X_1+\cdots+X_n)/n.$
  • $S^2 = \big( (X_1-\overline X)^2 + \cdots + (X_n-\overline X)^2 \big)/(n-1)$
  • $U^2 = \big( (X_1-\overline X)^2 + \cdots + (X_n-\overline X)^2 \big)/n$

Now recall that $$ T= \frac{\overline X- \mu}{S/\sqrt n} \sim t_{n-1}. \tag 1 $$ Now go to our tables or our software and find the number $A$ for which $$ \Pr(-A<T<A) = 0.9 $$ and conclude that $$ \Pr\left( \overline X- A\frac S {\sqrt n} < \mu < \overline X + A\frac S{\sqrt n} \right) = 0.9. $$ But what if we did not use Bessel's correction? Then we use $U$ instead of $S.$

We have $$ U = S\cdot \sqrt{\frac {n-1} n} $$ and therefore $$ \sqrt{\frac{n-1} n} \cdot \frac{\overline X - \mu}{U/\sqrt n} = \frac{\overline X - \mu}{S/\sqrt n}, $$ and so the event $$ -A < \sqrt{\frac{n-1} n} \cdot \frac{\overline X - \mu}{U/\sqrt n} < A $$ is the same as the event $$ -A\sqrt{\frac n {n-1}} < \frac{\overline X - \mu}{U/\sqrt n} < A\sqrt{\frac n {n-1}}, $$ i.e., writing $B = A\sqrt{\frac n {n-1}},$ $$ -B < \frac{\overline X - \mu}{U/\sqrt n} < B, $$ so that $$ \Pr\left( \overline X - B \frac U {\sqrt n} <\mu < \overline X + B\frac U {\sqrt n} \right) = 0.9. $$ This is exactly the same interval that we got using Bessel's correction. We could simply have designed our software and our tables to give us this number $B$ instead of the number $A$ that we get from the tables now used, and then proceeded without Bessel's correction.

So the use of Bessel's correction is an altogether separate issue from the problem of how to adjust the size of the confidence interval for the uncertainty in estimating $\sigma.$
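The algebra above can be verified on a concrete sample. This sketch (the data, seed, and names are illustrative) builds the 90% interval once from the Bessel-corrected $S$ with critical value $A,$ and once from the uncorrected $U$ with the rescaled critical value $B = A\sqrt{n/(n-1)}$; the two intervals coincide:

```python
# Numeric check: the interval (xbar ± B·U/√n) equals (xbar ± A·S/√n),
# because U = S·sqrt((n-1)/n) exactly cancels the rescaling of A into B.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
n = 8
x = rng.normal(5.0, 3.0, size=n)

xbar = x.mean()
S = x.std(ddof=1)   # with Bessel's correction
U = x.std(ddof=0)   # without

A = stats.t.ppf(0.95, n - 1)       # two-sided 90% t critical value
B = A * np.sqrt(n / (n - 1))       # rescaled critical value for U

ci_S = (xbar - A * S / np.sqrt(n), xbar + A * S / np.sqrt(n))
ci_U = (xbar - B * U / np.sqrt(n), xbar + B * U / np.sqrt(n))
print(ci_S)
print(ci_U)  # identical to ci_S
```

Note that $B > A$: dropping Bessel's correction shrinks the estimate of spread, and the critical value must grow by exactly the compensating factor.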

---

The issue is not that the sample standard deviation $S$ tends to underestimate the population standard deviation. The issue is that $S$ is a random variable rather than a constant, so its value fluctuates, and correspondingly $$ T = \frac{\bar X - \mu}{S/\sqrt{n}} $$ fluctuates more (has a larger variance) than $$ Z = \frac{\bar X - \mu}{\sigma/\sqrt{n}}. $$ So $T$ has longer tails than $Z$.
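The heavier tails show up directly in the critical values: for any fixed tail probability, the $t_\nu$ quantile exceeds the normal quantile, and the gap closes as $\nu$ grows (since $S \to \sigma$). A short sketch (nothing here is specific to the sample in question):

```python
# t critical values exceed the z critical value, reflecting the extra
# fluctuation contributed by S; the excess shrinks as df grows.
from scipy import stats

z975 = stats.norm.ppf(0.975)             # two-sided 95% z critical value
for df in (3, 10, 30, 100):
    print(df, stats.t.ppf(0.975, df))    # each exceeds z975, decreasing
# The variance of t_df is df/(df-2) > 1 for df > 2, e.g. Var(t_5) = 5/3.
```

The variance line quantifies the answer's point: $T$ is not merely a rescaled $Z$; it is a different distribution whose variance $\nu/(\nu-2)$ exceeds 1 precisely because $S$ is random.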