How does the Pearson correlation coefficient change under rotations

Question

1.1k Views Asked by Bumbble Comm At 30 Mar 2026 - 11:09

I was reading on wikipedia about the pearson correlation coefficient. Assuming the data has zero mean it can be written as

$$ \rho = \frac{ \sum x_i y_i } {\sqrt{\sum x_i^2 \sum y_i^2}} $$

The caption below this image says:

[...] Note that the correlation reflects the non-linearity and direction of a linear relationship (top row), but **not the slope of that relationship (middle)**, nor many aspects of nonlinear relationships (bottom). [...]

(bold text emphasis added by me)

The middle row of the picture shows several distributions that are perfectly correlated ($\rho=1$) and illustrates that, in that case, the correlation coefficient does not change when the slope changes (except when either $x$ or $y$ is constant).

However, I have my doubts whether the correlation coefficient is really independent of the slope when the correlation is not perfect (i.e. $|\rho|<1$). In other words, how does the correlation coefficient change when I apply a simple rotation

$$ \begin{aligned} x'_i &= x_i \cos(\alpha) + y_i \sin(\alpha) \\ y'_i &= -x_i \sin(\alpha) + y_i \cos(\alpha) \end{aligned} $$

to the data?

Note that the rotation does not change the mean values if $\sum x = \sum y = 0$, but even in the simple form as written above I didn't manage to derive an expression for

$$ \rho(\alpha) = ?? $$

yet. Or maybe I am just a bit confused and the correlation coefficient really does not change....

Bumbble Comm On 30 May 2016 - 11:34

I think something important is missing here: the variance matrices, which are the central concept in all questions of this kind.

Using matrix-vector notation (instead of purely algebraic calculations), and assuming that we work on centered data, the transformation reads:

$$X'=RX \ \ \text{with} \ \ R=\begin{bmatrix}\cos(\alpha)&\sin(\alpha)\\-\sin(\alpha)&\cos(\alpha)\end{bmatrix} \ \ \text{and}$$

$$X'=\begin{bmatrix}x'_1&x'_2&\cdots&x'_n\\y'_1&y'_2&\cdots&y'_n\end{bmatrix}, \ X=\begin{bmatrix}x_1&x_2&\cdots&x_n\\y_1&y_2&\cdots&y_n\end{bmatrix}$$

Thus $$X'X'^T=R(XX^T)R^T$$

In other words, naming $V$ and $V'$ the old and new (co)variance matrices respectively, $$V'=RVR^T$$

(see "A more general identity" in the linked reference).

This is how (co)variance matrices transform under a linear map (a kind of generalisation of the property $var(aX)=a^2 var(X)$). Thus the new variance matrix is in general different from the old one, and it stays different even if one normalizes each matrix in order to reason on correlation matrices instead of variance matrices.

The old formula $\rho=\dfrac{V_{12}}{\sqrt{V_{11}V_{22}}}$ is replaced by the new one: $\rho'=\dfrac{V'_{12}}{\sqrt{V'_{11}V'_{22}}}$

(see the final result of @the_candyman).
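The identity $V'=RVR^T$ is easy to check numerically. What follows is only a sketch in plain Python: the data, the seed, the angle and the helper names (`matmul`, `transpose`, `corr_from_cov`) are all made up for illustration, and $R$ is written with the sign convention of the rotation in the question.

```python
import math
import random

def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    """Transpose a matrix given as a list of rows."""
    return [list(col) for col in zip(*A)]

def corr_from_cov(V):
    """Pearson correlation read off a 2x2 (co)variance matrix."""
    return V[0][1] / math.sqrt(V[0][0] * V[1][1])

random.seed(0)  # arbitrary seed, only for reproducibility
n = 200
xs = [random.gauss(0, 1) for _ in range(n)]
ys = [0.3 * x + 0.5 * random.gauss(0, 1) for x in xs]  # correlated with xs

# Center the data and store it as a 2 x n matrix (rows: x and y).
mx, my = sum(xs) / n, sum(ys) / n
X = [[x - mx for x in xs], [y - my for y in ys]]

alpha = 0.7  # arbitrary angle in radians (assumption)
c, s = math.cos(alpha), math.sin(alpha)
R = [[c, s], [-s, c]]  # x' = x cos(a) + y sin(a), y' = -x sin(a) + y cos(a)

V = matmul(X, transpose(X))                     # old (unnormalized) covariance matrix
X_rot = matmul(R, X)                            # rotated data
V_direct = matmul(X_rot, transpose(X_rot))      # covariance of the rotated data
V_formula = matmul(matmul(R, V), transpose(R))  # V' = R V R^T

rho_old = corr_from_cov(V)
rho_new = corr_from_cov(V_formula)
print(rho_old, rho_new)
```

Both ways of computing the rotated matrix should agree up to rounding, while `rho_new` comes out different from `rho_old`. The normalization of $V$ (dividing by $n$ or $n-1$) is left out because it cancels in the correlation.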

**Bumbble Comm** · Accepted Answer

I suppose that $\sum_i x_i = \sum_i y_i = 0$. Moreover, let $P_x = \sum_i x_i^2$, $P_y = \sum_i y_i^2$ and $C_{xy} = \sum_i x_i y_i$.

Then, the sample Pearson coefficient $\rho$ based on data $x_i$ and $y_i$ produced by random variables $X$ and $Y$ is:

$$\rho = \frac{C_{xy}}{\sqrt{P_x P_y}}.$$

Notice that:

$$\begin{aligned} P_{x'} &= \sum_i {x'}^2_i = \sum_i (x_i \cos \alpha + y_i \sin \alpha)^2 \\ &= \cos^2 \alpha\sum_i x_i^2 + \sin^2 \alpha\sum_i y_i^2 + 2\sum_i x_i y_i \sin \alpha \cos \alpha \\ &= \cos^2 \alpha P_x + \sin^2 \alpha P_y + \sin(2\alpha) C_{xy}, \end{aligned}$$

$$\begin{aligned} P_{y'} &= \sum_i {y'}^2_i = \sum_i (-x_i \sin \alpha + y_i \cos \alpha)^2 \\ &= \sin^2 \alpha\sum_i x_i^2 + \cos^2 \alpha\sum_i y_i^2 - 2\sum_i x_i y_i \sin \alpha \cos \alpha \\ &= \sin^2 \alpha P_x + \cos^2 \alpha P_y - \sin(2\alpha) C_{xy}, \end{aligned}$$

and

$$\begin{aligned} C_{x'y'} &= \sum_i x_i' y_i' = \sum_i (x_i\cos \alpha + y_i \sin \alpha)( -x_i \sin \alpha + y_i \cos \alpha) \\ &= -\sum_i x_i^2\sin\alpha\cos \alpha + \sum_i x_i y_i (\cos^2 \alpha - \sin^2 \alpha) + \sum_i y_i^2\sin\alpha\cos \alpha \\ &= \frac{1}{2}\sin(2\alpha)(P_y - P_x) + C_{xy} \cos(2\alpha). \end{aligned}$$

Consider $\alpha = \frac{\pi}{2}$ and put all the pieces together:

$$\begin{aligned} \rho' = \frac{C_{x'y'}}{\sqrt{P_{x'} P_{y'}}} &= \frac{\frac{1}{2}\sin(2\frac{\pi}{2})(P_y - P_x) + C_{xy} \cos(2\frac{\pi}{2})}{\sqrt{(\cos^2 \frac{\pi}{2} P_x + \sin^2 \frac{\pi}{2} P_y + \sin(2\frac{\pi}{2}) C_{xy})(\sin^2 \frac{\pi}{2} P_x + \cos^2 \frac{\pi}{2} P_y - \sin(2\frac{\pi}{2}) C_{xy})}} \\ &= \frac{-C_{xy}}{\sqrt{P_yP_x}} = - \rho. \end{aligned}$$

Conclusion: a rotation does, in general, change the Pearson coefficient.
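The sign flip at $\alpha = \frac{\pi}{2}$ can be checked on a tiny data set. This is only a sketch: the five zero-mean points are made up, and `pearson` and `rotate` are hypothetical helper names, not taken from any library.

```python
import math

def pearson(xs, ys):
    """Sample Pearson coefficient; the data are centered inside the function."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    dx = [x - mx for x in xs]
    dy = [y - my for y in ys]
    c_xy = sum(a * b for a, b in zip(dx, dy))
    p_x = sum(a * a for a in dx)
    p_y = sum(b * b for b in dy)
    return c_xy / math.sqrt(p_x * p_y)

def rotate(xs, ys, alpha):
    """Apply x' = x cos(a) + y sin(a), y' = -x sin(a) + y cos(a) to every point."""
    c, s = math.cos(alpha), math.sin(alpha)
    xr = [x * c + y * s for x, y in zip(xs, ys)]
    yr = [-x * s + y * c for x, y in zip(xs, ys)]
    return xr, yr

# Made-up data with sum(xs) == sum(ys) == 0 and an imperfect correlation.
xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
ys = [-1.0, -1.5, 0.5, 0.5, 1.5]

rho_0 = pearson(xs, ys)
rho_45 = pearson(*rotate(xs, ys, math.pi / 4))
rho_90 = pearson(*rotate(xs, ys, math.pi / 2))
print(rho_0, rho_45, rho_90)
```

For these points $P_x = 10$, $P_y = 6$ and $C_{xy} = 7$, so `rho_0` equals $7/\sqrt{60}$ and `rho_90` is its negative up to rounding. The value at $45^\circ$ is neither of the two, which shows that intermediate angles change more than the sign.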

Addition

In general, the new Pearson coefficient, as a function of $\alpha$, is

$$\rho' = \frac{C_{x'y'}}{\sqrt{P_{x'} P_{y'}}} = \frac{\frac{1}{2}\sin(2\alpha)(P_y - P_x) + C_{xy} \cos(2\alpha)}{\sqrt{(\cos^2 \alpha P_x + \sin^2 \alpha P_y + \sin(2\alpha) C_{xy})(\sin^2 \alpha P_x + \cos^2 \alpha P_y - \sin(2\alpha) C_{xy})}}. $$
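The general expression can be compared against a brute-force rotation of the data. Again this is just a sketch under assumptions: the data set is made up, and `rho_rotated` and `rho_direct` are hypothetical names. The last lines use the observation that the numerator vanishes when $\tan(2\alpha) = \frac{2C_{xy}}{P_x - P_y}$, so the rotated data are uncorrelated at that angle.

```python
import math

def rho_rotated(p_x, p_y, c_xy, alpha):
    """Closed-form Pearson coefficient after rotating zero-mean data by alpha."""
    s2, c2 = math.sin(2 * alpha), math.cos(2 * alpha)
    cos_sq, sin_sq = math.cos(alpha) ** 2, math.sin(alpha) ** 2
    numerator = 0.5 * s2 * (p_y - p_x) + c_xy * c2
    p_x_new = cos_sq * p_x + sin_sq * p_y + s2 * c_xy
    p_y_new = sin_sq * p_x + cos_sq * p_y - s2 * c_xy
    return numerator / math.sqrt(p_x_new * p_y_new)

def rho_direct(xs, ys, alpha):
    """Rotate zero-mean data point by point, then form C / sqrt(P_x' P_y')."""
    c, s = math.cos(alpha), math.sin(alpha)
    xr = [x * c + y * s for x, y in zip(xs, ys)]
    yr = [-x * s + y * c for x, y in zip(xs, ys)]
    c_new = sum(a * b for a, b in zip(xr, yr))
    return c_new / math.sqrt(sum(a * a for a in xr) * sum(b * b for b in yr))

# Made-up zero-mean data (sum(xs) == sum(ys) == 0).
xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
ys = [-1.0, -1.5, 0.5, 0.5, 1.5]
p_x = sum(x * x for x in xs)
p_y = sum(y * y for y in ys)
c_xy = sum(x * y for x, y in zip(xs, ys))

for deg in (0, 15, 30, 45, 60, 90):
    alpha = math.radians(deg)
    print(deg, rho_rotated(p_x, p_y, c_xy, alpha), rho_direct(xs, ys, alpha))

# Angle at which the numerator of the closed form vanishes.
alpha_zero = 0.5 * math.atan2(2 * c_xy, p_x - p_y)
rho_at_zero = rho_rotated(p_x, p_y, c_xy, alpha_zero)
print(alpha_zero, rho_at_zero)
```

The two columns printed in the loop should agree up to rounding for every angle. Note that the closed form needs only the three sums $P_x$, $P_y$ and $C_{xy}$, not the individual points.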
