Correlation between three variables question


I was asked this question regarding correlation recently, and although it seems intuitive, I still haven't worked out the answer satisfactorily. I hope you can help me out with this seemingly simple question.

Suppose I have three random variables $A$, $B$, $C$. Is it possible to have these three relationships satisfied? $$ \mathrm{corr}[A,B] = 0.9 $$ $$ \mathrm{corr}[B,C] = 0.8 $$ $$ \mathrm{corr}[A,C] = 0.1 $$ My intuition is that it is not possible, although I can't see right now how I can prove this conclusively.


There are 4 best solutions below

0
On

You can use the fact that correlations can be understood as cosines of the angles between vectors drawn from a common origin. Apply the arccos function to each correlation and check whether every pairwise sum of angles is at least as large as the third angle, so that the three vectors can form the edges of a tetrahedron meeting at that origin. I get

[acos(0.9),acos(0.8),acos(0.1)]
 %1695 = [0.451026811796, 0.643501108793, 1.47062890563]

The sum of the first and the second is smaller than the third, so that combination cannot stem from a trivariate correlation.
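The same check can be sketched in Python. This is only an illustration of the angle argument above: the function name `angles_are_compatible` and the tolerance are my own choices, and I have added the condition that the three angles sum to at most $2\pi$, which is also needed in general for three vectors to exist.

```python
import math

def angles_are_compatible(r_ab, r_bc, r_ac, tol=1e-12):
    """Check whether arccos of the three correlations can be the
    pairwise angles between three vectors from a common origin."""
    x, y, z = (math.acos(r) for r in (r_ab, r_bc, r_ac))
    # Each angle must not exceed the sum of the other two ...
    triangle_ok = (x <= y + z + tol) and (y <= x + z + tol) and (z <= x + y + tol)
    # ... and the three angles together must not exceed a full turn.
    sum_ok = x + y + z <= 2 * math.pi + tol
    return triangle_ok and sum_ok

print(angles_are_compatible(0.9, 0.8, 0.1))  # False: the triple from the question
print(angles_are_compatible(0.9, 0.8, 0.5))  # True: a feasible third correlation
```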

10
On

Assume without loss of generality that the random variables $A$, $B$, $C$ are standard, that is, with mean zero and unit variance. Then, for any $(A,B,C)$ with the prescribed covariances, $$\mathrm{var}(A-B+C)=\mathrm{var}(A)+\mathrm{var}(B)+\mathrm{var}(C)-2\mathrm{cov}(A,B)-2\mathrm{cov}(B,C)+2\mathrm{cov}(A,C), $$ that is, $$ \mathrm{var}(A-B+C)=3-2\cdot0.9-2\cdot0.8+2\cdot0.1=-0.2\lt0, $$ which is absurd.
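For standardized variables, the variance of a linear combination with weights $w$ is the quadratic form $w^\top R\,w$, where $R$ is the correlation matrix. A minimal Python sketch of the computation above (the helper name `variance_of_combination` is hypothetical, and the matrix is just the one proposed in the question):

```python
def variance_of_combination(weights, corr):
    """Quadratic form w^T R w: the variance of sum_i w_i X_i
    when the X_i have unit variance and correlation matrix R."""
    n = len(weights)
    return sum(weights[i] * corr[i][j] * weights[j]
               for i in range(n) for j in range(n))

# Proposed correlations, variables in the order A, B, C.
R = [[1.0, 0.9, 0.1],
     [0.9, 1.0, 0.8],
     [0.1, 0.8, 1.0]]

v = variance_of_combination([1, -1, 1], R)  # "variance" of A - B + C
print(v)  # about -0.2, which is impossible for a real variance
```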

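Edit: Since correlations are cosines, for any random variables such that $\mathrm{corr}(A,B)=b$, $\mathrm{corr}(A,C)=c$ and $\mathrm{corr}(B,C)=a$, one must have $$ a\geqslant bc-\sqrt{1-b^2}\sqrt{1-c^2}. $$ The inequality is symmetric under relabeling the variables, so taking the two given correlations $0.9$ and $0.8$ as $b$ and $c$, the third correlation must satisfy $a\geqslant0.458$, which rules out $0.1$.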
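To spell out where this bound comes from (my own filling-in of the step, using the notation $\theta_a,\theta_b,\theta_c\in[0,\pi]$ for the angles whose cosines are $a,b,c$): the angle between two of the vectors cannot exceed the sum of their angles to the third vector. When $\theta_b+\theta_c\leqslant\pi$, as in the question, cosine is decreasing on the relevant range, so $$ a=\cos\theta_a\geqslant\cos(\theta_b+\theta_c)=\cos\theta_b\cos\theta_c-\sin\theta_b\sin\theta_c=bc-\sqrt{1-b^2}\sqrt{1-c^2}. $$ Numerically, $$ 0.9\cdot0.8-\sqrt{0.19}\,\sqrt{0.36}\approx0.72-0.4359\cdot0.6\approx0.458. $$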

0
On

As a follow up to one of the other answers, I'd like to finish the solution to make it clear. I find this approach very nice (Borat accent).

To build the correlation matrix, leave one of the correlations as an unknown $\rho$; here I leave $\rho=\rho_{AC}$. $$ \begin{bmatrix} 1 & 0.9 & \rho\\ 0.9 & 1 & 0.8\\ \rho & 0.8 & 1 \end{bmatrix} $$

Now, if you calculate the determinant, you'll get the bounds for $\rho$. The matrix above is a correlation matrix, so it has to be positive semidefinite and its determinant must be nonnegative. Expanding the determinant gives the following inequality:

$$ -0.45 + 0.72\rho + 0.72\rho - \rho^2 \ge 0 $$

Simplifying, this becomes $$ \rho^2 - 1.44\rho + 0.45 \le 0 $$

which holds when approximately $ 0.458 \le \rho \le 0.982 $. This means that the correlation between $A$ and $C$ has to lie in this range, and the proposed value $0.1$ does not.
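As a numerical cross-check of the determinant argument, here is a short Python sketch. The helper names `det3` and `corr_matrix` are hypothetical, and the determinant is expanded by hand so that no external library is assumed.

```python
import math

def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def corr_matrix(rho):
    """Correlation matrix from the question with corr(A, C) left free."""
    return [[1.0, 0.9, rho],
            [0.9, 1.0, 0.8],
            [rho, 0.8, 1.0]]

# Roots of rho^2 - 1.44*rho + 0.45 = 0 give the endpoints of the feasible range.
disc = math.sqrt(1.44 ** 2 - 4 * 0.45)
rho_low, rho_high = (1.44 - disc) / 2, (1.44 + disc) / 2
print(rho_low, rho_high)

# The determinant is negative at the proposed value and positive inside the range.
print(det3(corr_matrix(0.1)))
print(det3(corr_matrix(0.5)))
```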

0
On

Regarding @egreif1's answer: it is easy to see that $\rho(X, Y) = \rho(X - EX, Y)$, and more generally that shifting a variable or scaling it by a positive constant leaves the correlation between two variables unchanged. So, to simplify, we can work with zero-mean, unit-variance random variables. This can also be thought of in terms of Hilbert space.
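A small Python sketch of this invariance, using the sample analogue of the correlation. The data below are made up purely for illustration, and the helper names `pearson` and `standardize` are my own.

```python
import math

def pearson(xs, ys):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def standardize(xs):
    """Shift to mean zero and scale to unit variance."""
    n = len(xs)
    m = sum(xs) / n
    sd = math.sqrt(sum((x - m) ** 2 for x in xs) / n)
    return [(x - m) / sd for x in xs]

# Illustrative data only, not taken from the question.
xs = [1.0, 2.0, 4.0, 7.0, 11.0]
ys = [3.0, 1.0, 5.0, 6.0, 9.0]

r_raw = pearson(xs, ys)
r_std = pearson(standardize(xs), standardize(ys))
print(r_raw, r_std)  # the two values agree up to rounding error
```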