Sampling Distribution of Difference of Normal Random Variables


Let $X_1,X_2,...,X_m$ be i.i.d. from a $N(\mu_1,\sigma_1^2)$ distribution, let $Y_1,Y_2,...,Y_n$ be i.i.d. from a $N(\mu_2,\sigma_2^2)$ distribution, and let the $X_i$'s be independent of the $Y_j$'s. Determine the sampling distribution of the following quantity:

$$Q=\frac{(\bar X-\bar Y)-(\mu_1-\mu_2)}{\sqrt{\frac{\sigma_1^2}{m}+\frac{\sigma_2^2}{n}}}$$

where $\bar X$ and $\bar Y$ denote the respective sample means.

Now, this is related to the fact that the difference of independent normal random variables is also normal. However, my confusion comes from the fact that $n$ is not necessarily equal to $m$. My intuition tells me that $Q$ converges in distribution to $N(0,1)$ by the central limit theorem, but that's an unjustified guess that I'm not at all certain about. So my question is this: what is the sampling distribution of $Q$, and how do you justify it?


There are 3 best solutions below

0
On BEST ANSWER

It seems that you know that the sum of independent Gaussian random variables is Gaussian, so I'm guessing that you also know that adding a constant to a Gaussian RV, or multiplying it by a nonzero constant, again gives a Gaussian RV. So $Q$ is exactly Gaussian for any $m$ and $n$, whether or not they are equal; no central limit theorem is needed. The mean is: $$\Bbb E[Q] = \frac{1}{\sqrt{\frac{\sigma_1^2}{m} + \frac{\sigma_2^2}{n}}}\Bbb E[\bar X - \bar Y - \mu_1 + \mu_2] = 0$$

The variance is: \begin{align} \Bbb V[Q] &= \frac{1}{\frac{\sigma_1^2}{m} + \frac{\sigma_2^2}{n}}\Bbb V[\bar X - \bar Y - \mu_1 +\mu_2] \\ &= \frac{1}{\frac{\sigma_1^2}{m} + \frac{\sigma_2^2}{n}}\Bbb V[\bar X - \bar Y] \\ &= \frac{1}{\frac{\sigma_1^2}{m} + \frac{\sigma_2^2}{n}}(\Bbb V[\bar X] + \Bbb V[\bar Y]) \\ &= \frac{1}{\frac{\sigma_1^2}{m} + \frac{\sigma_2^2}{n}} \left(\frac{\sigma_1^2}{m} + \frac{\sigma_2^2}{n}\right)\\ &= 1 \end{align}

So $Q \sim \mathcal N\left(0, 1\right)$.
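As a quick empirical sanity check (not a proof), the sketch below simulates $Q$ with deliberately small and unequal sample sizes. If $Q$ is exactly standard normal, its empirical mean should be near 0, its variance near 1, and the fraction of draws at or below 1.96 near 0.975. The function names `simulate_q` and `summarize`, the seed, and all parameter values are illustrative assumptions, not part of the original answer.

```python
import math
import random
import statistics


def simulate_q(m, n, mu1, sigma1, mu2, sigma2, reps=20000, seed=0):
    """Draw `reps` independent realisations of Q for sample sizes m and n."""
    rng = random.Random(seed)
    # Denominator of Q: the standard deviation of (X-bar - Y-bar).
    scale = math.sqrt(sigma1**2 / m + sigma2**2 / n)
    qs = []
    for _ in range(reps):
        xbar = statistics.fmean(rng.gauss(mu1, sigma1) for _ in range(m))
        ybar = statistics.fmean(rng.gauss(mu2, sigma2) for _ in range(n))
        qs.append(((xbar - ybar) - (mu1 - mu2)) / scale)
    return qs


def summarize(qs):
    """Return the empirical mean, variance, and fraction of draws <= 1.96."""
    mean = statistics.fmean(qs)
    var = statistics.pvariance(qs, mu=mean)
    frac = sum(q <= 1.96 for q in qs) / len(qs)
    return mean, var, frac


if __name__ == "__main__":
    # Small, unequal sample sizes (hypothetical values chosen for illustration).
    qs = simulate_q(m=3, n=7, mu1=1.0, sigma1=2.0, mu2=-0.5, sigma2=0.5)
    mean, var, frac = summarize(qs)
    print(f"mean={mean:.3f}  var={var:.3f}  P(Q<=1.96)={frac:.3f}")
```

Using sample sizes as small as $m=3$ and $n=7$ is intentional: a result that held only asymptotically would typically show visible deviations at such sizes, whereas here the agreement should already be within simulation noise.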

0
On

Note that $\overline{X}$ is the mean of $X_1,X_2,...,X_m$. Therefore $\overline{X}-\mu_1$ is $N(0,(\sigma_1/\sqrt{m})^2)$, and similarly $\overline{Y}-\mu_2$ is $N(0,(\sigma_2/\sqrt{n})^2)$. The variance of a difference of two independent normals is the sum of their variances, so $(\overline{X}-\overline{Y})-(\mu_1-\mu_2)$ is $N(0,(\sigma_1/\sqrt{m})^2+(\sigma_2/\sqrt{n})^2)$. It should be clear now that you are just normalizing a zero-mean normal distribution by its standard deviation, resulting in a standard normal distribution.

0
On

The sum of independent normal random variables is a normal random variable whose mean is the sum of the means and whose variance is the sum of the variances, viz., for independent $U$ and $V$:$$U\sim\mathcal N(\mu_{\small U},\sigma^2_{\small U}),\ V\sim\mathcal N(\mu_{\small V},\sigma^2_{\small V}) \implies U+V\sim \mathcal N(\mu_{\small U}+\mu_{\small V},\sigma^2_{\small U}+\sigma^2_{\small V})\\(Z_j)_{j=1}^k\overset{iid}\sim\mathcal N(\mu,\sigma^2)\implies \sum_{j=1}^k Z_j\sim \mathcal N(k\mu,k\sigma^2)$$

An affine transformation of a normal random variable is again a normal random variable, viz., for constants $a \neq 0$ and $b$:$$Z\sim\mathcal N(\mu,\sigma^2)\implies (aZ+b)\sim\mathcal N(a\mu+b, a^2\sigma^2) $$

So since $\overline X=\frac 1m\sum_{k=1}^m X_k$, and similarly $\overline Y=\frac 1n\sum_{k=1}^n Y_k$, apply these rules to find the distribution of $Q$.
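
One way to carry these rules through, written in the notation of the question (a sketch of the steps, using the independence of the $X_i$'s and $Y_j$'s so that $\overline X$ and $\overline Y$ are independent):

$$\begin{aligned} \sum_{k=1}^m X_k \sim \mathcal N(m\mu_1, m\sigma_1^2) &\implies \overline X \sim \mathcal N\left(\mu_1, \frac{\sigma_1^2}{m}\right), \qquad \text{and likewise } \overline Y \sim \mathcal N\left(\mu_2, \frac{\sigma_2^2}{n}\right) \\ &\implies -\overline Y \sim \mathcal N\left(-\mu_2, \frac{\sigma_2^2}{n}\right) \\ &\implies \overline X - \overline Y \sim \mathcal N\left(\mu_1-\mu_2,\ \frac{\sigma_1^2}{m}+\frac{\sigma_2^2}{n}\right) \\ &\implies Q = \frac{(\overline X-\overline Y)-(\mu_1-\mu_2)}{\sqrt{\frac{\sigma_1^2}{m}+\frac{\sigma_2^2}{n}}} \sim \mathcal N(0,1) \end{aligned}$$

The first line uses the sum rule followed by scaling with $a=\frac 1m$ (respectively $\frac 1n$), the second uses $a=-1$, the third uses the sum rule for independent normals, and the last uses the affine rule with $a$ equal to the reciprocal of the standard deviation and $b$ chosen to remove the mean. Nothing in these steps requires $m=n$.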