The distribution of standardized Gaussian variables

Suppose we have a Gaussian distribution $\mathcal{X} = \mathcal{N}(\mu, \sigma^2)$ and a positive integer $k \in \mathbb{N}$. We do the following.

  1. First, we sample $k$ numbers $x_1, x_2, \ldots, x_k$ independently, with $x_i \sim \mathcal{X}$ for all $i \in [k]$.
  2. Then, we compute the sample mean $\tilde{\mu}$ and sample variance $\tilde{\sigma}^2$ of $x_1, x_2, \ldots, x_k$ and standardize: $z_i = \frac{x_i - \tilde{\mu}}{\tilde{\sigma}}$.
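The two steps can be sketched in NumPy. This is only an illustration: the question does not say which variance convention is used, so the unbiased one (`ddof=1`) is assumed here, and the values of `mu`, `sigma` and `k` are arbitrary. The helper name `standardize` is made up for this sketch.

```python
import numpy as np

def standardize(x, ddof=1):
    """Standardize a sample by its own mean and standard deviation.

    ddof=1 (unbiased sample variance) is an assumption; the question
    does not specify the convention.
    """
    x = np.asarray(x, dtype=float)
    return (x - x.mean()) / x.std(ddof=ddof)

rng = np.random.default_rng(0)
mu, sigma, k = 2.0, 3.0, 5          # arbitrary illustrative values

x = rng.normal(mu, sigma, size=k)   # step 1: k independent draws
z = standardize(x)                  # step 2: standardize with sample statistics

# By construction the z_i sum to zero and have unit sample variance,
# so they are tied together by two exact constraints.
print("sum of z:", z.sum())
print("sample variance of z:", z.var(ddof=1))
```

Whatever the draws are, the printed sum is zero and the sample variance is one (up to rounding), which already shows that the $z_i$ cannot be independent.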

Question: What is the distribution of $z_i$'s? Is it $\mathcal N(0, 1)$?

My thoughts: $\color{red}{\text{First of all, $z_i$'s are i.i.d. since $x_i$'s are i.i.d. (this is wrong.)}}$ As @jwhite pointed out, the $z_i$'s are i.i.d. only when $\tilde{\mu}$ and $\tilde{\sigma}$ are treated as given, not in general. Also, we know that $z_i \sim \mathcal N(0, 1)$ holds if $\tilde{\mu} = \mu$ and $\tilde{\sigma} = \sigma$, and thus, since $\tilde{\mu} \to \mu$ and $\tilde{\sigma} \to \sigma$ by the law of large numbers, it should hold when $k \to \infty$, i.e., $$\lim_{k \to \infty} \frac{x_i - \tilde{\mu}}{\tilde{\sigma}} = \frac{x_i - \mu}{\sigma} \sim \mathcal N(0, 1).$$ However, I am not sure about general values of $k$.

Best answer

The $z_i$ are not independent, because the sample mean $\tilde{\mu}$ and the sample standard deviation $\tilde{\sigma}$ both depend on every $x_j$. Both $\tilde{\mu}$ and $\tilde{\sigma}$ are themselves random and, with probability $1$, differ from $\mu$ and $\sigma$, so for finite $k$ the $z_i$ are not exactly $\mathcal N(0, 1)$. Standardizing with estimated rather than true parameters is what gives rise to Student's $t$-distribution. Its relation to standardized normal observations is described in the section "How the $t$ distribution arises" of this page.

https://en.wikipedia.org/wiki/Student%27s_t-distribution
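A quick Monte Carlo check makes the non-normality visible for small $k$. The sketch below again assumes the unbiased sample variance (`ddof=1`), and the helper `simulate_z1` and its parameter values are made up for illustration. It relies on two facts that follow from the constraints $\sum_i z_i = 0$ and $\sum_i z_i^2 = k - 1$: each $|z_i|$ is at most $(k-1)/\sqrt{k}$, and $\mathbb{E}[z_i^2] = (k-1)/k$ by exchangeability. A standard normal variable is unbounded and has second moment $1$.

```python
import numpy as np

def simulate_z1(k, trials=200_000, seed=0):
    """Return z_1 from `trials` independent repetitions of the experiment.

    Assumes the unbiased sample variance (ddof=1); mu and sigma are
    arbitrary because z_i is invariant to location and scale.
    """
    rng = np.random.default_rng(seed)
    x = rng.normal(loc=2.0, scale=3.0, size=(trials, k))
    mean = x.mean(axis=1, keepdims=True)
    std = x.std(axis=1, ddof=1, keepdims=True)
    z = (x - mean) / std
    return z[:, 0]

k = 4
z1 = simulate_z1(k)

bound = (k - 1) / np.sqrt(k)        # hard upper bound on |z_i|
second_moment = np.mean(z1 ** 2)    # should be close to (k - 1) / k

print("largest |z_1| observed:", np.abs(z1).max(), "bound:", bound)
print("mean of z_1^2:", second_moment, "expected:", (k - 1) / k)
```

Increasing `k` pushes the bound to infinity and the second moment to $1$, consistent with the $z_i$ becoming approximately standard normal for large samples.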