Sampling Distribution: Difference of Normal Random Variables


This is a self-study question. I am trying to complete this textbook problem.

Suppose that $X_{1}, X_{2}, \dots, X_{m}$ and $Y_{1}, Y_{2}, \dots, Y_{n}$ are independent random samples, with the $X_{i}$ and $Y_{i}$ normally distributed with means $\mu_{1}$ and $\mu_{2}$ and variances $\sigma^2_{1}$ and $\sigma^2_{2}$, respectively. The difference between the sample means, $\bar{X} - \bar{Y}$, is then a linear combination of $m+n$ normally distributed random variables and is therefore also normally distributed. If $\sigma^2_{1} = 2$, $\sigma^2_{2} = 2.5$, and $m=n$, find the sample sizes so that $(\bar{X} - \bar{Y})$ will be within 1 unit of $(\mu_{1} - \mu_{2})$ with probability $0.95$.

I started by writing $P(|\bar{X} - \bar{Y} - (\mu_{1} - \mu_{2})| \le 1) = 0.95$. I think I have to standardize, but I am unsure how to do that since I am working with two random variables at once.

The answer is $n = 17.29 \rightarrow 18$, but I am unsure how to get there.


There are 2 best solutions below

On BEST ANSWER

Let's look at the pieces. For each $i=1,\dots,m$ and $j=1,\dots,n$,

$$\begin{split}X_i&\sim N(\mu_1,\sigma_1^2)\\ Y_j&\sim N(\mu_2,\sigma_2^2)\end{split}$$

What the book means by calling $\bar X-\bar Y$ a linear combination of $m+n$ independent normal random variables is that $\bar X=\frac{1}{m}\sum_{i=1}^{m} X_i=\sum_{i=1}^{m} \frac{X_i}{m}$, and likewise for $\bar Y$, with

$$\begin{split}\frac {X_i}{m}&\sim N\left(\frac{\mu_1}{m},\frac{\sigma_1^2}{m^2}\right)\\ \frac{Y_j}{n}&\sim N\left(\frac{\mu_2}{n},\frac{\sigma_2^2}{n^2}\right)\end{split}$$

As you know, the sum of $k$ independent normal random variables $W_1,\dots,W_k$ with means $\nu_1,\dots,\nu_k$ and variances $\tau_1^2,\dots,\tau_k^2$ is normally distributed with mean $\sum_i \nu_i$ and variance $\sum_i \tau_i^2$.

Also note that

$$-\frac{Y_j}{n}\sim N\left(-\frac{\mu_2}{n},\frac{\sigma_2^2}{n^2}\right)$$

Hence

$$\begin{split}\bar X-\bar Y&=\frac{X_1}{m}+\dots+\frac{X_m}{m}-\frac{Y_1}{n}-\dots-\frac{Y_n}{n}\\ &\sim N\left(m\cdot \frac{\mu_1}{m}-n\cdot \frac{\mu_2}{n}, m\cdot \frac{\sigma_1^2}{m^2}+n\cdot \frac{\sigma_2^2}{n^2}\right)\\ &= N\left(\mu_1-\mu_2, \frac{\sigma_1^2}{m}+\frac{\sigma_2^2}{n}\right)\end{split}$$
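As a sanity check on this distribution, here is a minimal Monte Carlo sketch in Python (standard library only). The means $\mu_1 = 5$ and $\mu_2 = 3$, the sample sizes $m = n = 18$, the seed, and the function name `simulate_mean_differences` are all assumptions chosen for illustration; only the variances $2$ and $2.5$ come from the problem.

```python
import random
import statistics


def simulate_mean_differences(mu1, mu2, var1, var2, m, n, reps, seed=0):
    """Draw `reps` independent realizations of X-bar minus Y-bar."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    sd1, sd2 = var1 ** 0.5, var2 ** 0.5
    diffs = []
    for _ in range(reps):
        xbar = statistics.fmean(rng.gauss(mu1, sd1) for _ in range(m))
        ybar = statistics.fmean(rng.gauss(mu2, sd2) for _ in range(n))
        diffs.append(xbar - ybar)
    return diffs


# mu1 = 5 and mu2 = 3 are arbitrary illustrative values, not from the problem.
diffs = simulate_mean_differences(5.0, 3.0, 2.0, 2.5, m=18, n=18, reps=10_000)

empirical_mean = statistics.fmean(diffs)      # should be near mu1 - mu2 = 2
empirical_var = statistics.variance(diffs)    # should be near theoretical_var
theoretical_var = 2.0 / 18 + 2.5 / 18         # sigma1^2/m + sigma2^2/n = 0.25

print(empirical_mean, empirical_var, theoretical_var)
```

The simulated mean and variance should land close to $\mu_1-\mu_2$ and $\sigma_1^2/m+\sigma_2^2/n$, up to Monte Carlo noise.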

We seek the smallest $n$ such that

$$P(|\bar X-\bar Y-(\mu_1-\mu_2)|\le 1)\ge.95$$

Standardize by dividing by the standard error, which with $m=n$ is $\sqrt{(\sigma_1^2+\sigma_2^2)/n}$:

$$\begin{split}P\left(\left|\frac{\bar X-\bar Y-(\mu_1-\mu_2)}{\sqrt{\frac{\sigma_1^2+\sigma_2^2}{n}}}\right|\le \frac 1{\sqrt{\frac{\sigma_1^2+\sigma_2^2}{n}}}\right)&\ge.95\\ P\left(|Z|\le\sqrt{\frac{n}{\sigma_1^2+\sigma_2^2}}\right)&\ge.95\end{split}$$

Because $P(|Z|\le z)=0.95$ leaves $0.025$ in each tail, the critical value is the $0.975$ quantile of the standard normal. Using R, we find

> qnorm(.975)
[1] 1.959964
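The same quantile can be cross-checked outside R. The sketch below uses Python's standard-library `statistics.NormalDist` (a tool chosen here for illustration, not something the answer itself uses) and also confirms that the interval $[-z, z]$ carries probability $0.95$.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

# Counterpart of R's qnorm(.975): the 0.975 quantile of the standard normal.
z = std_normal.inv_cdf(0.975)

# Probability mass between -z and z; this should be 0.95 up to rounding.
central_mass = std_normal.cdf(z) - std_normal.cdf(-z)

print(z, central_mass)
```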

Thus

$$\begin{split}\sqrt{\frac{n}{4.5}}&\ge1.96\\ n&\ge 4.5\,(1.96)^2\approx 17.29\end{split}$$

Since $n$ must be an integer, we round up and pick $m=n=18$.
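The whole calculation can be condensed into a few lines. This is a sketch under the assumptions of the problem (equal sample sizes, known variances); the helper names `required_sample_size` and `coverage` are made up for this example.

```python
import math
from statistics import NormalDist


def required_sample_size(var1, var2, half_width, confidence):
    """Smallest common n with P(|Xbar - Ybar - (mu1 - mu2)| <= half_width) >= confidence."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # two-sided critical value
    return math.ceil(z ** 2 * (var1 + var2) / half_width ** 2)


def coverage(var1, var2, half_width, n):
    """P(|Xbar - Ybar - (mu1 - mu2)| <= half_width) when m = n."""
    standard_error = math.sqrt((var1 + var2) / n)
    return 2 * NormalDist().cdf(half_width / standard_error) - 1


n = required_sample_size(2.0, 2.5, half_width=1.0, confidence=0.95)
print(n, coverage(2.0, 2.5, 1.0, n - 1), coverage(2.0, 2.5, 1.0, n))
```

Comparing the coverage at $n-1$ and at $n$ shows why rounding up is needed: the smaller sample size falls just short of $0.95$, while the rounded-up one exceeds it.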

On

Given a random sample $X_{1}, X_{2}, \dots, X_{m}$ from a normal distribution with mean $\mu$ and variance $\sigma^2$, it is a standard result that the sample mean $\bar{X}$ is normally distributed with mean $\mu$ and variance $\sigma^2/m$. It follows that in your example $\bar{X}$ is normally distributed with mean $\mu_{1}$ and variance $\sigma^2_{1}/m$, and $\bar{Y}$ is normally distributed with mean $\mu_{2}$ and variance $\sigma^2_{2}/n$. Because the two samples are independent, the difference $\bar{X} - \bar{Y}$ is normal with mean $\mu_{1} -\mu_{2}$ and variance $\sigma^2_{1}/m + \sigma^2_{2}/n$. As $m=n$, with the given values $\bar{X} - \bar{Y}$ has variance $4.5/n$. Standardizing $P(-1<(\bar{X} - \bar{Y})-(\mu_{1} -\mu_{2})<1)=0.95$ gives $P\left(-\frac{1}{\sqrt{4.5/n}} < Z < \frac{1}{\sqrt{4.5/n}}\right) = 0.95$, where $Z$ is a standard normal variable. The rest should be clear, I hope.
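
One way to finish, sketched here for completeness: by symmetry, the upper endpoint of the standardized interval must equal the $0.975$ quantile of the standard normal, which is about $1.96$, so

$$\frac{1}{\sqrt{4.5/n}} = 1.96 \quad\Longrightarrow\quad \sqrt{n} = 1.96\sqrt{4.5} \quad\Longrightarrow\quad n = 4.5\,(1.96)^2 \approx 17.29,$$

and rounding up to the next integer gives $n = 18$.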