Finding the mean of a biased sample

An RV $X$ follows a uniform distribution over $[0,1]$.

Suppose that we cannot observe the realization $x$ of $X$.

Instead, we can observe a value $y$ such that $x\in\{y,\frac{1}{2}+\frac{y}{2},\frac{3}{4}+\frac{y}{2}\}$. That is, the true $x$ is one of these three values.

If so, what is the conditional expectation of $X$ given an observation $y\in[0,\frac{1}{2}]$?

I initially thought it should be $$\frac{1}{3}y+\frac{1}{3}\left(\frac{1}{2}+\frac{y}{2}\right)+\frac{1}{3}\left(\frac{3}{4}+\frac{y}{2}\right)$$

but it looks to be wrong somehow and I can't figure out why.

Any idea about the conditional probabilities and mean?

Best answer

First off, the problem is far less confusing if you state, given the value $x$, which values of $y$ you can observe. For example, if $y=1$, then as you put it, $x$ is 1, 1, or 5/4. The last is impossible, and what do two equivalent possibilities mean?

Next, the problem is underdetermined unless you know the probabilities of observing the three different types. You're using 1/3 for each, but that is an unstated assumption.

To have a solvable problem, suppose that the three cases $X=Y$, $X=\frac{1}{2}+\frac{Y}{2}$, and $X=\frac{3}{4}+\frac{Y}{2}$ happen with 1/3 probability each independent of $X$.

First invert these relations to have $Y_1=X$, $Y_2=2X-1$, and $Y_3=2X-\frac{3}{2}$. We have the mixture distribution $$f_Y(y)=\Pr(X=Y)f_{Y_1}(y)+\Pr\left(X=\frac{1}{2}+\frac{Y}{2}\right)f_{Y_2}(y)+\Pr\left(X=\frac{3}{4}+\frac{Y}{2}\right)f_{Y_3}(y)\text{.}$$

Since $Y_1=X$, it is clear that $f_{Y_1}(y)=1$ for $0\leq y\leq1$. The monotonic change of variables formula gives $$f_{Y_2}(y)=\left|\frac{dx}{dy}\right|f_X(x)$$ when $x=\frac{1}{2}+\frac{y}{2}$, so $f_{Y_2}(y)=\frac{1}{2}$ for $-1\leq y\leq1$. Similarly $f_{Y_3}(y)=\frac{1}{2}$ for $-\frac{3}{2}\leq y\leq\frac{1}{2}$.
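To spell out the "similarly" step for the third case: with $x=\frac{3}{4}+\frac{y}{2}$ we have $\left|\frac{dx}{dy}\right|=\frac{1}{2}$, and $f_X(x)=1$ exactly when $0\leq\frac{3}{4}+\frac{y}{2}\leq1$, that is, when $-\frac{3}{2}\leq y\leq\frac{1}{2}$. So $$f_{Y_3}(y)=\frac{1}{2}\cdot1=\frac{1}{2}\quad\text{for }-\frac{3}{2}\leq y\leq\frac{1}{2}\text{.}$$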

Writing down the conditional expectation formally is a pain because of the mixed discrete/continuous distributions, so just consider all the possibilities. When $0\leq y\leq\frac{1}{2}$, all 3 cases are applicable, so $f_Y(y)=2/3$.
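Written out with the assumed weight of 1/3 for each case, the mixture density on this range is $$f_Y(y)=\frac{1}{3}\cdot1+\frac{1}{3}\cdot\frac{1}{2}+\frac{1}{3}\cdot\frac{1}{2}=\frac{2}{3}\quad\text{for }0\leq y\leq\frac{1}{2}\text{.}$$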

With relative probability $\Pr(X=Y)f_{Y_1}(y)=1/3$ we are in the $X=Y$ case (so $X=y$), and with relative probability $\Pr\left(X=\frac{1}{2}+\frac{Y}{2}\right)f_{Y_2}(y)=1/6$ we are in the $X=\frac{1}{2}+\frac{Y}{2}$ case (so $X=\frac{1}{2}+\frac{y}{2}$); similarly for the third.

Thus the conditional expectation is $$\frac{\frac{1}{3}(y)+\frac{1}{6}(\frac{1}{2}+\frac{y}{2})+\frac{1}{6}(\frac{3}{4}+\frac{y}{2})}{\frac{2}{3}}=\frac{1}{2}y+\frac{1}{4}\left(\frac{1}{2}+\frac{y}{2}\right)+\frac{1}{4}\left(\frac{3}{4}+\frac{y}{2}\right)$$
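Collecting terms, under the same 1/3-each assumption, gives a single line in $y$: $$\frac{1}{2}y+\frac{1}{4}\left(\frac{1}{2}+\frac{y}{2}\right)+\frac{1}{4}\left(\frac{3}{4}+\frac{y}{2}\right)=\frac{3}{4}y+\frac{5}{16}\text{,}$$ so the conditional mean runs from $\frac{5}{16}$ at $y=0$ to $\frac{11}{16}$ at $y=\frac{1}{2}$. For comparison, the equal-weight guess in the question simplifies to $\frac{2}{3}y+\frac{5}{12}$, which puts too much weight on the two larger candidates.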

Recap: your comment has the right answer, but your reasoning on the intervals is backwards. The first possibility $X=Y$ has $Y$ in an interval of length 1; the second, $X=\frac{1}{2}+\frac{Y}{2}$, requires $Y$ to spread over an interval of length 2, so any given range of values of $Y$ is half as likely, and that case gets half the weight in the weighted average that is the conditional expectation.
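If you want to convince yourself numerically, here is a rough Monte Carlo sketch. It assumes the same model as above (each of the three cases chosen with probability 1/3, independently of $X$), and it approximates conditioning on $Y=y_0$ by keeping samples whose $y$ falls in a small window around $y_0$. The function names, window width, and sample size are arbitrary choices for illustration, not part of the original problem.

```python
import random


def simulate_conditional_mean(y0, half_width=0.01, n=400_000, seed=0):
    """Estimate E[X | Y close to y0] by simulation.

    Assumption: the three cases X=Y, X=1/2+Y/2, X=3/4+Y/2 each occur
    with probability 1/3, independently of X.
    """
    rng = random.Random(seed)
    total = 0.0
    count = 0
    for _ in range(n):
        x = rng.random()         # X ~ Uniform[0, 1]
        case = rng.randrange(3)  # pick one of the three cases uniformly
        if case == 0:
            y = x                # X = Y
        elif case == 1:
            y = 2 * x - 1        # X = 1/2 + Y/2
        else:
            y = 2 * x - 1.5      # X = 3/4 + Y/2
        # Keep only samples whose observation lands near y0.
        if abs(y - y0) <= half_width:
            total += x
            count += 1
    return total / count


def closed_form(y):
    """Conditional mean (3/4)y + 5/16, valid for 0 <= y <= 1/2."""
    return 0.75 * y + 5 / 16


if __name__ == "__main__":
    for y0 in (0.1, 0.25, 0.4):
        print(y0, simulate_conditional_mean(y0), closed_form(y0))
```

For $y_0=0.25$ the closed form gives exactly $0.5$, and the simulated estimate should land close to that, up to sampling noise. Keep $y_0$ at least `half_width` away from 0 and 1/2, since outside $[0,\frac{1}{2}]$ a different set of cases applies.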