How can I calculate the range of correlation of two variables X and Z given I have the correlations of X and Y, and Y and Z?
I've found a few resources around, namely this, but I'd like a research paper (if any).
Thanks!
The average of the three correlations $\rho_{X,Y}$, $\rho_{Y,Z}$, and $\rho_{X,Z}$ must be $-\frac{1}{2}$ or more (cf. this answer on stats.SE) and so $$\max\left\{-1, -\frac{3+ 2\rho_{X,Y} +2 \rho_{Y,Z}}{2}\right\} \leq \rho_{X,Z} \leq 1.$$
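To make the bound concrete, here is a minimal Python sketch of the inequality above. The function name `lower_bound_from_average` is just an illustrative choice, not something from a library. Note that this is only a necessary condition implied by the averaging argument, so the true attainable range may be narrower.

```python
def lower_bound_from_average(r_xy, r_yz):
    """Lower bound on r_xz implied by (r_xy + r_yz + r_xz) / 3 >= -1/2."""
    for r in (r_xy, r_yz):
        if not -1.0 <= r <= 1.0:
            raise ValueError("correlations must lie in [-1, 1]")
    # Solve the averaging inequality for r_xz, then clip at -1,
    # since no correlation can be below -1.
    return max(-1.0, -(3.0 + 2.0 * r_xy + 2.0 * r_yz) / 2.0)


# With two strongly negative correlations, the bound forces r_xz to be positive:
# -(3 - 1.8 - 1.8) / 2 = 0.3 (up to floating-point rounding).
print(lower_bound_from_average(-0.9, -0.9))

# With two zero correlations, the bound is vacuous and returns -1.
print(lower_bound_from_average(0.0, 0.0))
```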
The correlation is the cosine of an angle.
Given samples $x_1,\ldots,x_n$ and $y_1,\ldots,y_n$, let $$ \bar x = \frac{x_1+\cdots+x_n}{n}\text{ and }\bar y = \frac{y_1+\cdots+y_n}{n}. $$ Then the sample correlation is the cosine of the angle between these two centered vectors: $$ (x_1-\bar x, \ldots, x_n-\bar x)\text{ and } (y_1-\bar y, \ldots, y_n-\bar y). $$
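If you want to convince yourself numerically, here is a small Python sketch that computes the cosine of the angle between the centered vectors and compares it with the usual computational formula for Pearson's $r$. The helper names `cosine_of_centered` and `pearson_r`, and the sample data, are made up for illustration.

```python
import math


def cosine_of_centered(x, y):
    """Cosine of the angle between the mean-centered versions of x and y."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    xc = [xi - x_bar for xi in x]
    yc = [yi - y_bar for yi in y]
    dot = sum(a * b for a, b in zip(xc, yc))
    norm_x = math.sqrt(sum(a * a for a in xc))
    norm_y = math.sqrt(sum(b * b for b in yc))
    return dot / (norm_x * norm_y)


def pearson_r(x, y):
    """Pearson correlation via the raw-sums formula, for comparison."""
    n = len(x)
    sx, sy = sum(x), sum(y)
    sxy = sum(a * b for a, b in zip(x, y))
    sxx = sum(a * a for a in x)
    syy = sum(b * b for b in y)
    return (n * sxy - sx * sy) / math.sqrt((n * sxx - sx**2) * (n * syy - sy**2))


# Arbitrary illustrative data.
x = [1.0, 2.0, 4.0, 7.0]
y = [2.0, 1.0, 5.0, 6.0]

# The two numbers agree up to floating-point rounding.
print(cosine_of_centered(x, y))
print(pearson_r(x, y))
```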
If you have the angles between $x$ and $y$ and between $y$ and $z$ (each taken in $[0,\pi]$), then the angle between $x$ and $z$ cannot exceed the sum of those two, nor can it be less than the absolute value of their difference. It can be anywhere in between (if the sum exceeds $\pi$, the largest possible angle is $2\pi$ minus that sum).
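Translating the angle statement back into correlations: with $\alpha = \arccos\rho_{X,Y}$ and $\beta = \arccos\rho_{Y,Z}$, the cosine addition formulas give $$\cos(\alpha+\beta) \le \rho_{X,Z} \le \cos(\alpha-\beta),$$ that is, $$\rho_{X,Y}\rho_{Y,Z} - \sqrt{(1-\rho_{X,Y}^2)(1-\rho_{Y,Z}^2)} \;\le\; \rho_{X,Z} \;\le\; \rho_{X,Y}\rho_{Y,Z} + \sqrt{(1-\rho_{X,Y}^2)(1-\rho_{Y,Z}^2)}.$$ Here is a minimal Python sketch of that computation; the function name `corr_range` is my own choice for illustration.

```python
import math


def corr_range(r_xy, r_yz):
    """Range of r_xz implied by the angle argument, given r_xy and r_yz."""
    for r in (r_xy, r_yz):
        if not -1.0 <= r <= 1.0:
            raise ValueError("correlations must lie in [-1, 1]")
    alpha = math.acos(r_xy)  # angle between x and y, in [0, pi]
    beta = math.acos(r_yz)   # angle between y and z, in [0, pi]
    # Largest angle gives the smallest cosine, and vice versa.
    # cos(alpha + beta) stays correct even when alpha + beta exceeds pi,
    # because cos(2*pi - t) == cos(t).
    lower = math.cos(alpha + beta)
    upper = math.cos(alpha - beta)
    return lower, upper


# For r_xy = 0.8 and r_yz = 0.6: 0.48 -/+ sqrt(0.36 * 0.64) = 0.48 -/+ 0.48,
# so the range is approximately (0.0, 0.96), up to floating-point rounding.
print(corr_range(0.8, 0.6))
```

Two uncorrelated pairs (`corr_range(0.0, 0.0)`) leave $\rho_{X,Z}$ completely unconstrained, while two perfect correlations pin it down to a single value.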
There is a book by Danny Kaplan that has a chapter about this. I think the word "statistics" is in the title.