Let x and y be two random variables such that:
Corr(x,y) = b, where Corr(x,y) denotes the correlation between x and y, and b is a scalar in the range [-1, 1]. Let y' be an estimate of y. An example could be y' = y + (rand(0,1) - 0.5)*0.1, where rand(0,1) returns a uniform random number between 0 and 1; that is, I am adding some noise to the data.
My questions are:
- Is there a way to bound the correlation between x and y', i.e. Corr(x,y')?
- I described y' in terms of a random perturbation. What if I don't have that information and only know that y' is an estimate of y? Is there any literature that covers this case?
Let $e=y'-y$. Assuming that $e$ is independent from $x$ and $y$ with $\mu_e=E(e)=0$, then $\mu_{y'}=E(y')=E(y)=\mu_y$ and:
$$\begin{align}\mathrm{Corr}(x,y') &= \frac{E((x-\mu_x)(y'-\mu_y))}{\sigma_x\sigma_{y'}}\\ &= \frac{E((x-\mu_x)(y-\mu_y)) + E((x-\mu_x)e)}{\sigma_x\sigma_{y'}}\\ &= \mathrm{Corr}(x,y)\frac{\sigma_y}{\sigma_{y'}}\end{align}$$
The last step uses $E((x-\mu_x)e)=E(x-\mu_x)E(e)=0$, which holds since $x$ and $e$ are independent.
Now, $\sigma_{y'} = \sqrt{\sigma_y^2 + \sigma_e^2}$, again by independence, so:
$$\mathrm{Corr}(x,y')=\mathrm{Corr}(x,y) \frac{1}{\sqrt{1+ \left(\frac{\sigma_e}{\sigma_y}\right)^2}}$$
So $|\mathrm{Corr}(x,y')|\le|\mathrm{Corr}(x,y)|$, with strict inequality whenever $\sigma_e>0$ and $\mathrm{Corr}(x,y)\neq 0$.
For the specific $e$ you have given, which is uniform on $[-0.05, 0.05]$, the variance is $0.1^2/12$, so $\sigma_e = \frac{0.1}{\sqrt{12}} \approx 0.029$.
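As a numerical sanity check of the attenuation formula, here is a minimal simulation sketch using only the Python standard library. The values `b = 0.6` and `sigma_y = 0.05`, the Gaussian choice for $x$ and $y$, and the helper names `corr` and `attenuation` are assumptions made for illustration; only the uniform noise term comes from the question. It takes $\sigma_e$ to be the standard deviation of a uniform variable on an interval of width 0.1, i.e. $0.1/\sqrt{12}$.

```python
import math
import random

def corr(a, b):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    va = sum((u - ma) ** 2 for u in a)
    vb = sum((v - mb) ** 2 for v in b)
    return cov / math.sqrt(va * vb)

def attenuation(sigma_e, sigma_y):
    """Factor 1 / sqrt(1 + (sigma_e / sigma_y)^2) multiplying Corr(x, y)."""
    return 1.0 / math.sqrt(1.0 + (sigma_e / sigma_y) ** 2)

rng = random.Random(0)   # fixed seed so the run is repeatable
n = 200_000
b = 0.6                  # assumed population value of Corr(x, y)
sigma_y = 0.05           # assumed; chosen small so the noise matters

# x is standard normal; y has standard deviation sigma_y and Corr(x, y) = b.
x = [rng.gauss(0.0, 1.0) for _ in range(n)]
y = [sigma_y * (b * xi + math.sqrt(1.0 - b * b) * rng.gauss(0.0, 1.0)) for xi in x]

# Noise from the question: uniform on [-0.05, 0.05], independent of x and y.
e = [(rng.random() - 0.5) * 0.1 for _ in range(n)]
y_noisy = [yi + ei for yi, ei in zip(y, e)]

sigma_e = 0.1 / math.sqrt(12.0)   # std of a uniform variable of width 0.1
predicted = corr(x, y) * attenuation(sigma_e, sigma_y)
observed = corr(x, y_noisy)
print(f"predicted Corr(x, y') = {predicted:.4f}")
print(f"observed  Corr(x, y') = {observed:.4f}")
```

With these assumed values $(\sigma_e/\sigma_y)^2 = 1/3$, so the factor is $\sqrt{3}/2 \approx 0.866$, and the two printed numbers should agree up to sampling error.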
Technically, "estimate" has no precise meaning here. You can always say that $e=y'-y$ is just another random variable. If you don't know that $e$ is independent of $x$, you don't know what $E((x-\mu_x)(e-\mu_e))$ is. If you don't know that $e$ and $y$ are independent, you can't express $\sigma_{y'}$ in terms of $\sigma_{e}$ and $\sigma_{y}$. In particular, you don't know that $\sigma_{y'}\ge\sigma_{y}$.
A simple example: if $y'=x$ then $\mathrm{Corr}(x,y')=1$. So if $x$ and $y$ are close enough that $x$ can be said to be an "estimate for $y$", then $\mathrm{Corr}(x,x)=1\ge\mathrm{Corr}(x,y)$, and the correlation has not decreased.
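To illustrate how dependence between the error and $x$ can raise the correlation, here is a small sketch with a hypothetical estimate $y'=(x+y)/2$, so that $e=y'-y=(x-y)/2$ depends on both $x$ and $y$. The choice $b=0.6$, the standard normal marginals, and the helper `corr` are assumptions for illustration. For unit-variance $x$ and $y$ with correlation $b$, a direct computation gives $\mathrm{Corr}(x,y')=\sqrt{(1+b)/2}$, which is about $0.894$ for $b=0.6$.

```python
import math
import random

def corr(a, b):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    va = sum((u - ma) ** 2 for u in a)
    vb = sum((v - mb) ** 2 for v in b)
    return cov / math.sqrt(va * vb)

rng = random.Random(1)   # fixed seed so the run is repeatable
n = 100_000
b = 0.6                  # assumed population value of Corr(x, y)

# x and y are standard normal with Corr(x, y) = b.
x = [rng.gauss(0.0, 1.0) for _ in range(n)]
y = [b * xi + math.sqrt(1.0 - b * b) * rng.gauss(0.0, 1.0) for xi in x]

# Hypothetical "estimate" whose error (x - y) / 2 is NOT independent of x.
y_mixed = [(xi + yi) / 2.0 for xi, yi in zip(x, y)]

corr_xy = corr(x, y)
corr_xy_mixed = corr(x, y_mixed)
print(f"Corr(x, y)  = {corr_xy:.4f}")
print(f"Corr(x, y') = {corr_xy_mixed:.4f}")
```

Here the "estimate" is more correlated with $x$ than $y$ itself is, so no bound of the form $|\mathrm{Corr}(x,y')|\le|\mathrm{Corr}(x,y)|$ can hold without an independence assumption on the error.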