How does the Pearson correlation coefficient change under rotations?


I was reading on Wikipedia about the Pearson correlation coefficient. Assuming the data has zero mean, it can be written as

$$ \rho = \frac{ \sum x_i y_i } {\sqrt{\sum x_i^2 \sum y_i^2}} $$

The caption below this image says:

[...] Note that the correlation reflects the non-linearity and direction of a linear relationship (top row), but not the slope of that relationship (middle), nor many aspects of nonlinear relationships (bottom). [...]

(bold text emphasis added by me)

The middle row of the picture shows several distributions that are perfectly correlated ($\rho=1$) and illustrates that, in that case, the correlation coefficient does not change when the slope changes (except when either $x$ or $y$ is constant).

However, I doubt that the correlation coefficient is really independent of the slope when the correlation is not perfect (i.e. $|\rho|<1$). In other words, how does the correlation coefficient change when I apply a simple rotation

$$ x'_i = x_i \cos(\alpha) + y_i \sin(\alpha) \\ y'_i = -x_i \sin(\alpha) + y_i \cos(\alpha) $$

to the data?

Note that the rotation does not change the mean values if $\sum_i x_i = \sum_i y_i = 0$, but even in the simple form written above I have not yet managed to derive an expression for

$$ \rho(\alpha) = \;? $$

Or maybe I am just a bit confused and the correlation coefficient really does not change.


There are 2 best solutions below

Best answer

I suppose that $\sum_i x_i = \sum_i y_i = 0$. Moreover, let $P_x = \sum_i x_i^2$, $P_y = \sum_i y_i^2$ and $C_{xy} = \sum_i x_i y_i$.

Then, the sample Pearson coefficient $\rho$ based on data $x_i$ and $y_i$ produced by random variables $X$ and $Y$ is:

$$\rho = \frac{C_{xy}}{\sqrt{P_x P_y}}.$$

Notice that:

$$P_{x'} = \sum_i {x'}^2_i = \sum_i (x_i \cos \alpha + y_i \sin \alpha)^2 = \\ \cos^2 \alpha\sum_i x_i^2 + \sin^2 \alpha\sum_i y_i^2 + 2\sum_i x_i y_i \sin \alpha \cos \alpha = \\\cos^2 \alpha P_x + \sin^2 \alpha P_y + \sin(2\alpha) C_{xy},$$

$$P_{y'} = \sum_i {y'}^2_i = \sum_i (-x_i \sin \alpha + y_i \cos \alpha)^2 = \\ \sin^2 \alpha\sum_i x_i^2 + \cos^2 \alpha\sum_i y_i^2 - 2\sum_i x_i y_i \sin \alpha \cos \alpha = \\\sin^2 \alpha P_x + \cos^2 \alpha P_y - \sin(2\alpha) C_{xy},$$

and

$$C_{x'y'} = \sum_i x_i' y_i' = \sum_i (x_i\cos \alpha + y_i \sin \alpha)( -x_i \sin \alpha + y_i \cos \alpha) = \\ -\sum_i x_i^2\sin\alpha\cos \alpha + \sum_i x_i y_i (\cos^2 \alpha - \sin^2 \alpha) + \sum_i y_i^2\sin\alpha\cos \alpha = \\ \frac{1}{2}\sin(2\alpha)(P_y - P_x) + C_{xy} \cos(2\alpha).$$

Consider $\alpha = \frac{\pi}{2}$ and put all the pieces together:

$$\rho' = \frac{C_{x'y'}}{\sqrt{P_{x'} P_{y'}}} = \frac{\frac{1}{2}\sin(2\frac{\pi}{2})(P_y - P_x) + C_{xy} \cos(2\frac{\pi}{2})}{\sqrt{(\cos^2 \frac{\pi}{2} P_x + \sin^2 \frac{\pi}{2} P_y + \sin(2\frac{\pi}{2}) C_{xy})(\sin^2 \frac{\pi}{2} P_x + \cos^2 \frac{\pi}{2} P_y - \sin(2\frac{\pi}{2}) C_{xy})}} = \\ = \frac{-C_{xy}}{\sqrt{P_yP_x}} = - \rho. $$

Conclusion: rotation affects the Pearson coefficient.
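The $\alpha = \frac{\pi}{2}$ case can also be checked numerically. The following is only a minimal sketch: the synthetic data, the seed and the helper names `pearson_zero_mean` and `rotate` are assumptions made for illustration, and the rotation follows the convention of the question.

```python
import numpy as np

def pearson_zero_mean(x, y):
    """Pearson coefficient for data that has already been centered."""
    return np.sum(x * y) / np.sqrt(np.sum(x**2) * np.sum(y**2))

def rotate(x, y, alpha):
    """Rotation in the question's convention (hypothetical helper)."""
    xr = x * np.cos(alpha) + y * np.sin(alpha)
    yr = -x * np.sin(alpha) + y * np.cos(alpha)
    return xr, yr

# Synthetic, imperfectly correlated data (an assumption for this sketch).
rng = np.random.default_rng(0)
x = rng.normal(size=200)
y = 0.5 * x + rng.normal(size=200)
x -= x.mean()  # center, so the zero-mean formula applies
y -= y.mean()

rho = pearson_zero_mean(x, y)
rho_rot = pearson_zero_mean(*rotate(x, y, np.pi / 2))
print(rho, rho_rot)  # the two values should differ only in sign
```

Up to floating-point error, the second printed value should be the negative of the first, in line with $\rho' = -\rho$.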

Addition

In general, the new Pearson coefficient, as a function of $\alpha$, is

$$\rho' = \frac{C_{x'y'}}{\sqrt{P_{x'} P_{y'}}} = \frac{\frac{1}{2}\sin(2\alpha)(P_y - P_x) + C_{xy} \cos(2\alpha)}{\sqrt{(\cos^2 \alpha P_x + \sin^2 \alpha P_y + \sin(2\alpha) C_{xy})(\sin^2 \alpha P_x + \cos^2 \alpha P_y - \sin(2\alpha) C_{xy})}}. $$
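As a sanity check, the expression for $\rho'$ can be compared against a direct computation on rotated data. This is a sketch under stated assumptions: the data are synthetic, and `rho_closed_form` and `rho_direct` are hypothetical helper names, not part of any library. `np.corrcoef` is NumPy's standard correlation routine.

```python
import numpy as np

def rho_closed_form(Px, Py, Cxy, alpha):
    """Evaluate the expression for rho' from the three sums Px, Py, Cxy."""
    s, c = np.sin(alpha), np.cos(alpha)
    s2, c2 = np.sin(2 * alpha), np.cos(2 * alpha)
    Cxy_rot = 0.5 * s2 * (Py - Px) + Cxy * c2
    Px_rot = c**2 * Px + s**2 * Py + s2 * Cxy
    Py_rot = s**2 * Px + c**2 * Py - s2 * Cxy
    return Cxy_rot / np.sqrt(Px_rot * Py_rot)

def rho_direct(x, y, alpha):
    """Rotate the data explicitly, then let NumPy compute the correlation."""
    xr = x * np.cos(alpha) + y * np.sin(alpha)
    yr = -x * np.sin(alpha) + y * np.cos(alpha)
    return np.corrcoef(xr, yr)[0, 1]

# Synthetic centered data (an assumption for this sketch).
rng = np.random.default_rng(42)
x = rng.normal(size=300)
y = 0.7 * x + 0.5 * rng.normal(size=300)
x -= x.mean()
y -= y.mean()

Px, Py, Cxy = np.sum(x**2), np.sum(y**2), np.sum(x * y)
alphas = np.linspace(0.0, np.pi, 13)
closed = np.array([rho_closed_form(Px, Py, Cxy, a) for a in alphas])
direct = np.array([rho_direct(x, y, a) for a in alphas])

for a, r in zip(alphas, closed):
    print(f"alpha = {a:5.3f}  rho' = {r:+.4f}")
```

The two arrays should agree to rounding error, and the printed values make it visible that $\rho'$ varies with $\alpha$ when the correlation is not perfect.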

Second answer

I think something important is missing here: the variance matrices, which are the central concept in all such questions.

Using matrix-vector notation (instead of all-algebraic calculations), and assuming that we work on centered data, we have the following transformation:

$$X'=RX \ \ \text{with} \ \ R=\begin{bmatrix}\cos(\alpha)&-\sin(\alpha)\\\sin(\alpha)&\cos(\alpha)\end{bmatrix} \ \ \text{and}$$ $$X'=\begin{bmatrix}x'_1&x'_2&\cdots&x'_n\\y'_1&y'_2&\cdots&y'_n\end{bmatrix}, \ X=\begin{bmatrix}x_1&x_2&\cdots&x_n\\y_1&y_2&\cdots&y_n\end{bmatrix}$$

Thus $$X'X'^T=R(XX^T)R^T$$

In other words $$V'=RVR^T$$

where $V$ and $V'$ denote the old and new (co)variance matrices, respectively, which are proportional to $XX^T$ and $X'X'^T$ (see "A more general identity" in this).

This is how (co)variance matrices transform; it is a kind of generalisation of the property $\operatorname{var}(aX)=a^2 \operatorname{var}(X)$. Thus the new variance matrix can be very different from the old one, even if one normalizes each of them in order to reason on correlation matrices instead of variance matrices.
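The identity $V'=RVR^T$ is easy to verify numerically. The snippet below is a rough sketch with made-up data. It uses the matrix $R$ exactly as written in this answer and normalizes by $n$, which is an assumption (any common factor cancels in the correlation). The helper name `corr_from_cov` is hypothetical.

```python
import numpy as np

alpha = 0.3  # arbitrary angle chosen for illustration
R = np.array([[np.cos(alpha), -np.sin(alpha)],
              [np.sin(alpha),  np.cos(alpha)]])

# Synthetic 2 x n data matrix: first row x, second row y (an assumption).
rng = np.random.default_rng(1)
X = rng.normal(size=(2, 500))
X[1] += 0.8 * X[0]                  # introduce some correlation
X -= X.mean(axis=1, keepdims=True)  # center each row

n = X.shape[1]
V = X @ X.T / n        # old (co)variance matrix
Xr = R @ X             # rotated data
Vr = Xr @ Xr.T / n     # new (co)variance matrix

def corr_from_cov(V):
    """Correlation coefficient read off a 2 x 2 covariance matrix."""
    return V[0, 1] / np.sqrt(V[0, 0] * V[1, 1])

rho_old, rho_new = corr_from_cov(V), corr_from_cov(Vr)
print(np.allclose(Vr, R @ V @ R.T), rho_old, rho_new)
```

The first printed value should be `True`, while the two correlation coefficients generally differ.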

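The old formula $\rho=\dfrac{V_{12}}{\sqrt{V_{11}V_{22}}}$ is replaced by the new one: $\rho'=\dfrac{V'_{12}}{\sqrt{V'_{11}V'_{22}}}$

(see the final result of @the_candyman; note that the $R$ used here rotates in the opposite sense to the formulas in the question, so the two results agree after replacing $\alpha$ by $-\alpha$).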
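As a side remark, the matrix form also shows why the perfectly correlated case behaves differently. This is a short sketch, assuming $V'_{11}V'_{22} \neq 0$ so that $\rho'$ is defined. Since $\det R = 1$,

$$\det V' = \det(R)\,\det(V)\,\det(R^T) = \det V, \quad \text{i.e.} \quad V'_{11}V'_{22} - {V'_{12}}^2 = V_{11}V_{22} - V_{12}^2,$$

and therefore

$$\rho'^2 = \frac{{V'_{12}}^2}{V'_{11}V'_{22}} = \frac{{V'_{12}}^2}{{V'_{12}}^2 + \det V}.$$

For perfectly correlated data $\det V = 0$, so $\rho' = \pm 1$ for every rotation angle at which it is defined, which is what the middle row of the picture shows. When $\det V > 0$, $\rho'$ depends on $\alpha$ through $V'_{12}$.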