On a homework problem, I am given two variables, $x$ and $y$, with variances $4$ and $16$, respectively. The question is how many observations should I draw of $y$ in order to estimate the difference between the variables' means, if I am only allowed $30$ observations total?
I know the (very) basic idea is that I need more samples of $y$ than of $x$, and I suspect that I am going to need to derive functions of $x$ and $y$ to plug into a Lagrangean, with 30 as my constraint. Apart from that, I'm lost. It doesn't seem to make sense to run the Lagrangean with $16y + 4x - \lambda(1-30)$.
Apart from that, the only idea I can come up with is that I should take $25$ samples of $y$ and $5$ samples of $x$, since the variance of $y$ is equal to the variance of $x$ squared. However, I doubt that (1) that answer is right or (2) it will satisfy my professor even if it is right.
Does anyone know how to get going on this?
Suppose you take $m$ samples of $x$, $x_1,\ldots,x_m$, and $n$ samples of $y$, $y_1,\ldots,y_n$.
Then, the estimated difference in the means is $\hat{x}-\hat{y} = \dfrac{1}{m}\sum_{i=1}^{m}x_i - \dfrac{1}{n}\sum_{j=1}^{n}y_j$.
You want to minimize $\text{Var}[\hat{x}-\hat{y}]$ subject to $m+n = 30$. Can you compute the variance of $\hat{x}-\hat{y}$ in terms of $m$ and $n$? Remember that the samples are drawn independently.
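Once you have the variance as a function of $m$ and $n$, you can sanity-check the algebra numerically. A minimal sketch, assuming independence gives $\text{Var}[\hat{x}-\hat{y}] = \sigma_x^2/m + \sigma_y^2/n$ (that is the expression you should arrive at):

```python
# Numerical sanity check: scan all feasible integer splits of 30
# observations and find the one minimizing Var[x_hat - y_hat].
# Assumes independent samples, so the variance of the difference of
# sample means is var_x/m + var_y/n.
var_x, var_y, total = 4, 16, 30

def var_diff(m):
    """Variance of the estimated mean difference with m samples of x."""
    n = total - m
    return var_x / m + var_y / n

best_m = min(range(1, total), key=var_diff)
print(best_m, total - best_m, var_diff(best_m))
```

Comparing the brute-force minimizer against your Lagrangean solution is a good way to confirm the first-order conditions were set up correctly.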