Trying to understand correlation and independence geometrically


I am trying to understand correlation and independence of two random variables geometrically, but I find it difficult to grasp the intuition and to explain it rigorously.

First, suppose $X$ and $Y$ are two uniformly distributed random variables, and that $(X,Y)$ and $(-X,Y)$ have the same distribution. How can we show and explain that $\mathbb{E}[XY]=\mathbb{E}[(-X)Y]=-\mathbb{E}[XY]$? From this equality, is there any way to tell whether $X$ and $Y$ are independent or not, and whether they are uncorrelated or not?

Consider a point with Cartesian coordinates $(X,Y)$ on the unit circle, with $X$ and $Y$ uniformly distributed. Are $X$ and $Y$ uncorrelated, and why? Are they independent, and why?

My attempt at reasoning with the unit circle is that if we pick a point outside the circle, then $0\neq\mathbb{E}[X]\,\mathbb{E}[Y]\neq\mathbb{E}[XY]=0$. Am I right?

How about on the unit square?

Sorry for so many questions at once; I am trying to understand the concept as a whole.

Thanks.


There are 2 best solutions below

1
On BEST ANSWER

I will answer your question about the unit circle. I think what you meant to say is that the point with coordinates $(X,Y)$ is chosen uniformly from the unit disk, that is, the unit circle together with its interior. This doesn't mean that $X$ and $Y$ are themselves uniform. In fact, they're not.

$X$ and $Y$ are not independent. Intuitively, two variables are independent if knowing something about one of them doesn't affect the probability distribution of the other (more precisely, the conditional density of $Y$ given $X=x$ is the same as the density of $Y$ for every $x$). Here that fails. For example, if we know $X=0$, then the conditional distribution of $Y$ is uniform between $-1$ and $1$ (think of a vertical cross-section of the disk at $X=0$). But if we know $X=3/5$, then the conditional distribution of $Y$ is uniform between $-4/5$ and $4/5$ (vertical cross-section at $X=3/5$). Thus, knowing things about $X$ changes the distribution of $Y$, so $X$ and $Y$ are not independent.
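The cross-section argument can be checked numerically. The following is a minimal sketch, assuming $(X,Y)$ is uniform on the unit disk; the helper names `sample_unit_disk` and `max_abs_y_near`, the seed, and the strip width are illustrative choices, not part of the original answer. It looks at how far $Y$ spreads in a thin vertical strip around $X=0$ and around $X=3/5$.

```python
import random

def sample_unit_disk(n, seed=0):
    """Draw n points uniformly from the unit disk by rejection sampling."""
    rng = random.Random(seed)
    points = []
    while len(points) < n:
        x, y = rng.uniform(-1, 1), rng.uniform(-1, 1)
        if x * x + y * y <= 1:  # keep only points inside the disk
            points.append((x, y))
    return points

def max_abs_y_near(points, x0, half_width=0.02):
    """Largest |y| among points whose x lies within half_width of x0."""
    return max(abs(y) for x, y in points if abs(x - x0) <= half_width)

points = sample_unit_disk(100_000)
spread_at_0 = max_abs_y_near(points, 0.0)   # strip around X = 0
spread_at_06 = max_abs_y_near(points, 0.6)  # strip around X = 3/5
print(spread_at_0, spread_at_06)
```

In the strip around $X=0$ the values of $Y$ reach almost to $\pm 1$, while in the strip around $X=3/5$ they stay near or below $4/5$ in absolute value, which is the dependence described above.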

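$X$ and $Y$ are uncorrelated. Here it's best to use the formula for correlation: $\rho_{X,Y}=E\big((X-\mu_X)(Y-\mu_Y)\big)/(\sigma_X\sigma_Y)$. In the case of the circle, the distribution of $(X,Y)$ is symmetric about the origin, so $\mu_X=\mu_Y=0$. It is also unchanged by the reflection $x\mapsto -x$, so $(X,Y)$ and $(-X,Y)$ have the same distribution and $E(XY)=E((-X)Y)=-E(XY)$, which forces $E(XY)=0$. Hence the expectation in the numerator evaluates to $0$.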
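As a cross-check, here is a sketch of the direct calculation, assuming $(X,Y)$ is uniform on the unit disk with density $1/\pi$ (the reading used in the cross-section argument above). In polar coordinates $x=r\cos\theta$, $y=r\sin\theta$:

$$E(XY)=\frac{1}{\pi}\int_0^{2\pi}\!\!\int_0^1 (r\cos\theta)(r\sin\theta)\,r\,dr\,d\theta=\frac{1}{\pi}\left(\int_0^1 r^3\,dr\right)\left(\int_0^{2\pi}\cos\theta\sin\theta\,d\theta\right)=\frac{1}{\pi}\cdot\frac{1}{4}\cdot 0=0.$$

The same computation with a single factor $r\cos\theta$ or $r\sin\theta$ gives $\mu_X=\mu_Y=0$, so the covariance, and hence $\rho_{X,Y}$, is $0$.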

Intuitively, two variables are positively correlated if when you know one of them is relatively large you also expect the other to be relatively large. Two variables are negatively correlated if when you know one of them is relatively large then you expect the other one to be relatively small. Two variables are uncorrelated if when you know something about how small or large one of them is, it doesn't change what you expect about how small or large the other one is.

According to this intuition, we might have guessed that $X$ and $Y$ are indeed uncorrelated, since no matter what $x$ is, the expected value of $Y$ given that $X=x$ is always $0$ (the conditional distribution of $Y$ given $X=x$ is symmetric about $0$). Caution: this intuition shouldn't always be trusted. At the end of the day, you should back up your intuition with calculations.

Note: independence always implies zero correlation (whenever the correlation is defined), but not the other way around, as this example shows.
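A small simulation can illustrate the note. This is only a sketch, again assuming $(X,Y)$ is uniform on the unit disk; `sample_unit_disk` and `pearson` are hypothetical helpers written for this example. It estimates the correlation of $X$ and $Y$, and also the correlation of $X^2$ and $Y^2$, which would have to be $0$ if $X$ and $Y$ were independent.

```python
import math
import random

def sample_unit_disk(n, seed=1):
    """Draw n points uniformly from the unit disk by rejection sampling."""
    rng = random.Random(seed)
    points = []
    while len(points) < n:
        x, y = rng.uniform(-1, 1), rng.uniform(-1, 1)
        if x * x + y * y <= 1:
            points.append((x, y))
    return points

def pearson(us, vs):
    """Sample Pearson correlation coefficient of two equal-length lists."""
    n = len(us)
    mean_u, mean_v = sum(us) / n, sum(vs) / n
    cov = sum((u - mean_u) * (v - mean_v) for u, v in zip(us, vs)) / n
    std_u = math.sqrt(sum((u - mean_u) ** 2 for u in us) / n)
    std_v = math.sqrt(sum((v - mean_v) ** 2 for v in vs) / n)
    return cov / (std_u * std_v)

points = sample_unit_disk(100_000)
xs = [x for x, _ in points]
ys = [y for _, y in points]

corr_xy = pearson(xs, ys)                                # close to 0
corr_sq = pearson([x * x for x in xs], [y * y for y in ys])  # clearly negative
print(corr_xy, corr_sq)
```

The first estimate is close to $0$, while the second is clearly negative (for the disk the exact value works out to $-1/3$): a large $X^2$ leaves less room for $Y^2$. So the pair is uncorrelated but not independent.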

1
On
  1. Correlation is a measure of linear relationship. In other words, the correlation coefficient measures the extent to which the points $(X,Y)$ fall along a straight line.

  2. Suppose you were to draw a contour plot of the joint density $f(x,y)$. If the pattern of lighter and darker regions along the $y$ direction is the same for every value of $x$, up to an overall brightness factor that depends only on $x$, then the variables are independent.

    Another way to understand independence is to imagine the joint density function as a mountain. The two variables are independent if the cross-section of the mountain has the same shape for every value of $x$, differing only in overall height.
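Both points can be tried out numerically. The sketch below is illustrative only: the two example densities (a product of standard normal shapes, and the uniform density on the unit disk), the grid, and the helper names are assumptions made for the example. The first part compares the correlation of an exact line with that of $Y=X^2$; the second part rescales the cross-sections of a density at two values of $x$ and measures how much they differ.

```python
import math
import random

def pearson(us, vs):
    """Sample Pearson correlation coefficient of two equal-length lists."""
    n = len(us)
    mean_u, mean_v = sum(us) / n, sum(vs) / n
    cov = sum((u - mean_u) * (v - mean_v) for u, v in zip(us, vs)) / n
    std_u = math.sqrt(sum((u - mean_u) ** 2 for u in us) / n)
    std_v = math.sqrt(sum((v - mean_v) ** 2 for v in vs) / n)
    return cov / (std_u * std_v)

# Point 1: correlation only picks up the linear part of a relationship.
rng = random.Random(0)
xs = [rng.uniform(-1, 1) for _ in range(50_000)]
corr_linear = pearson(xs, [2 * x + 1 for x in xs])  # points on a line
corr_square = pearson(xs, [x * x for x in xs])      # dependent, but not linear

# Point 2: compare rescaled cross-sections of a joint density.
def normalised_slice(density, x, ys):
    """Values of density(x, y) over ys, rescaled to sum to 1."""
    values = [density(x, y) for y in ys]
    total = sum(values)
    return [v / total for v in values]

def slice_gap(density, x_a, x_b, ys):
    """Largest difference between the rescaled slices at x_a and x_b."""
    slice_a = normalised_slice(density, x_a, ys)
    slice_b = normalised_slice(density, x_b, ys)
    return max(abs(p - q) for p, q in zip(slice_a, slice_b))

def product_density(x, y):
    """Unnormalised joint density of two independent standard normals."""
    return math.exp(-x * x / 2) * math.exp(-y * y / 2)

def disk_density(x, y):
    """Uniform density on the unit disk."""
    return 1 / math.pi if x * x + y * y <= 1 else 0.0

ys = [i / 100 for i in range(-100, 101)]
gap_product = slice_gap(product_density, 0.0, 0.6, ys)  # slices agree
gap_disk = slice_gap(disk_density, 0.0, 0.6, ys)        # slices differ
print(corr_linear, corr_square, gap_product, gap_disk)
```

The line gives a correlation of essentially $1$ and the parabola a correlation near $0$, even though $Y=X^2$ is completely determined by $X$. For the product density the rescaled cross-sections coincide up to rounding error, while for the disk they do not.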