Pearson Correlation as a measure for non-linear dependence.

Question

227 Views Asked by Bumbble Comm At 02 Apr 2026 - 6:38

It is known that $\rho$, the Pearson correlation, is a measure of the linear dependence of two random variables, say $X$ and $Y$. But couldn't you simply transform $X$ and $Y$ and consider

$$ \rho(f(X),g(Y)), $$ where $f$ and $g$ are non-linear functions, so that it measures other kinds of dependence (take, for example, $f(s)=g(s)=s^2$ for quadratic dependence)?

There are 2 best solutions below

Bumbble Comm On 09 Mar 2022 - 11:39

As you note, Pearson's correlation coefficient reflects linear dependence of random variables. It is invariant to changes in scale and location, i.e.

$$\rho(X,Y)=\rho(a+bX,c+dY)$$

for constants $a,b,c,d$ with $b,d>0$ (if exactly one of $b,d$ is negative, the sign of $\rho$ flips). It is not invariant to more general nonlinear transformations.
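The invariance claim can be checked numerically. The following is a minimal sketch using only the Python standard library; the helper `pearson` and the toy data are made up for illustration and are not part of the original answer.

```python
import math

def pearson(xs, ys):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical toy data, chosen only for illustration.
xs = [0.5, 1.0, 2.0, 3.5, 4.0, 6.0]
ys = [1.2, 0.9, 2.5, 2.0, 4.1, 3.8]

base = pearson(xs, ys)

# Affine maps with positive slopes (b = 2, d = 0.5) leave rho unchanged.
affine = pearson([3.0 + 2.0 * x for x in xs], [-1.0 + 0.5 * y for y in ys])

# A negative slope on one side flips the sign.
flipped = pearson([-x for x in xs], ys)

# Nonlinear (even monotone) maps generally change the value.
nonlinear = pearson([math.exp(x) for x in xs], [y ** 3 for y in ys])

print(base, affine, flipped, nonlinear)
```

On this data `base` and `affine` agree up to rounding, `flipped` is the negative of `base`, and `nonlinear` is a visibly different number.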

An alternative measure of dependence is mutual information. Intuitively, this captures how different a joint distribution of two random variables is from the product of their marginals, and thus reflects a broader notion of dependence (not just linear dependence). Moreover, mutual information is invariant to bijective transformations, i.e.

$$I(X;Y)=I(f(X);g(Y))$$

for bijections $f,g$.
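For discrete variables the invariance is easy to verify by hand, because a bijection only relabels the outcomes. Below is a minimal sketch, assuming a joint distribution given as a dictionary `{(x, y): probability}`; the helpers `mutual_information` and `push_forward` and the example distribution ($X$ uniform on $\{-1,0,1\}$, $Y = X^2$) are illustrative choices, not taken from the answer.

```python
import math
from collections import defaultdict

def mutual_information(joint):
    """Mutual information (in nats) of a discrete joint pmf {(x, y): p}."""
    px = defaultdict(float)
    py = defaultdict(float)
    for (x, y), p in joint.items():
        px[x] += p
        py[y] += p
    return sum(p * math.log(p / (px[x] * py[y]))
               for (x, y), p in joint.items() if p > 0)

def push_forward(joint, f, g):
    """Joint pmf of (f(X), g(Y)) given the joint pmf of (X, Y)."""
    out = defaultdict(float)
    for (x, y), p in joint.items():
        out[(f(x), g(y))] += p
    return dict(out)

# Illustrative example: X uniform on {-1, 0, 1} and Y = X**2.
# Here Cov(X, Y) = E[X**3] - E[X] E[X**2] = 0, yet Y is a function of X.
joint = {(-1, 1): 1 / 3, (0, 0): 1 / 3, (1, 1): 1 / 3}

mi = mutual_information(joint)

# f(x) = x**3 + 5 and g(y) = exp(y) are injective, so they only relabel outcomes.
mi_bijective = mutual_information(
    push_forward(joint, lambda x: x ** 3 + 5, lambda y: math.exp(y)))

# A constant map is not a bijection and destroys all information.
mi_constant = mutual_information(
    push_forward(joint, lambda x: 0, lambda y: y))

print(mi, mi_bijective, mi_constant)
```

In this example the mutual information equals the entropy of $Y$, namely $\ln 3 - \tfrac{2}{3}\ln 2$, even though the pair is uncorrelated; the constant map shows that invariance really does need injectivity.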

**Bumbble Comm** · Accepted Answer

Consider the vector space $\mathrm{S} = \mathscr{L}^2(\Omega, \mathscr{F}, \mathbf{P})$ of square-integrable random variables. Here, you can introduce the inner product $$ (X \mid Y) = \int\limits_\Omega XY \, d\mathbf{P}. $$ Suppose now that $X$ and $Y$ are two elements of $\mathrm{S}$ and assume further that they are standardised (a.k.a. studentised), in other words $\mathbf{E}(X) = 0$ and $\mathbf{V}(X) = 1,$ and similarly for $Y.$ Under these circumstances, $$ \mathrm{corr}(X,Y) = (X \mid Y). $$

Interpretation. We know from elementary linear algebra that the inner product of two unit vectors is the cosine of the angle they form (in the plane generated by both), and a standardised variable has norm $1$ in $\mathrm{S}.$ So $\mathrm{corr}(X, Y) = 0$ means that $X$ and $Y$ are orthogonal in $\mathrm{S},$ while $|\mathrm{corr}(X, Y)| = 1$ is equivalent to $\theta \in \{0, \pi\},$ where $\theta$ is the angle formed by $X$ and $Y.$ Of course, $\theta = 0$ means that $X$ and $Y$ point in the same direction while $\theta = \pi$ means that they point in opposite directions. In either case, $X$ and $Y$ are collinear, and this reduces to $Y = aX + b$ in the general setting (i.e. not necessarily standardised).
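The geometric picture has a direct empirical analogue: after standardising two samples, the mean of their products plays the role of $(X \mid Y)$. A minimal sketch, with hypothetical helper names `standardise` and `inner` and made-up data:

```python
import math

def standardise(v):
    """Shift and scale a sample to mean 0 and (population) variance 1."""
    n = len(v)
    m = sum(v) / n
    sd = math.sqrt(sum((t - m) ** 2 for t in v) / n)
    return [(t - m) / sd for t in v]

def inner(u, v):
    """Empirical version of (X | Y) = E[XY]: the mean of the products."""
    return sum(a * b for a, b in zip(u, v)) / len(u)

def angle(u, v):
    """Angle between two standardised samples, clamped against rounding."""
    return math.acos(max(-1.0, min(1.0, inner(u, v))))

# Illustrative data only.
raw = [1.0, 2.0, 4.0, 7.0, 11.0]
x = standardise(raw)
same = standardise([3.0 * t + 2.0 for t in raw])       # Y = aX + b with a > 0
opposite = standardise([-0.5 * t + 1.0 for t in raw])  # Y = aX + b with a < 0

norm_sq = inner(x, x)           # a standardised sample has squared norm 1
cos_same = inner(x, same)       # cosine of the angle, close to +1
cos_opposite = inner(x, opposite)  # close to -1

print(norm_sq, cos_same, cos_opposite, angle(x, same), angle(x, opposite))
```

The two affine images give angles of (numerically) $0$ and $\pi$, which is the collinear case described above.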

A common confusion is to read $\mathrm{corr}(X,Y) = 0$ as saying that $Y$ is not of the form $f(X)$ for any measurable function $f.$ In fact, $Y = f(X)$ if and only if $Y \in \mathrm{S}_X,$ where $\mathrm{S}_X$ is the closed subspace of $\mathrm{S}$ generated by all bounded measurable images of $X$; it can be shown that $\mathrm{S}_X = \mathscr{L}^2(\Omega, X^{-1}(\mathscr{B}_\mathbf{R}), \mathbf{P}).$ This is not the same as the span of $X,$ which is $\langle X \rangle = \mathbf{R} X = \{aX \mid a \in \mathbf{R}\}.$ In this linear algebra language, the correlation is the coefficient of the orthogonal projection onto the span of $X.$ That is, for standardised $X$ and $Y,$ $$ \mathrm{pr}_{\langle X \rangle}(Y) = \mathrm{corr}(X, Y) X. $$ Furthermore, the function $Z \mapsto \mathbf{E}(Z \mid X)$ from $\mathrm{S} \to \mathrm{S}_X$ is the orthogonal projection from $\mathrm{S}$ onto $\mathrm{S}_X,$ which is the right object when we want functional dependency. Explicitly, $$ \mathrm{pr}_{\mathrm{S}_X}(Z) = \mathbf{E}(Z \mid X). $$

As you can tell, the mathematical framework is crystal clear. If you want to show that $Y$ is not functionally dependent on $X,$ you have to show that $Y \perp \mathrm{S}_X,$ which is the same as $\mathbf{E}(Y \mid X) = 0.$ (Actually, the condition "$Y$ is functionally dependent on $X$" signifies $Y \in \mathrm{S}_X$, so its negation really is $Y \notin \mathrm{S}_X$ and not $Y \perp \mathrm{S}_X.$ This latter relation has no English translation, or not an obvious one; it is akin to "Nothing of $Y$ can be functionally dependent on $X$.") If you show that $\mathrm{corr}(X, Y) = 0,$ then you are just showing that $Y \perp \langle X \rangle,$ but many functions $f(X)$ are orthogonal to $X$: being orthogonal to the line $\langle X \rangle$ is far weaker than being orthogonal to the whole subspace $\mathrm{S}_X.$
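The gap between the two projections shows up already in a small deterministic example. The sketch below is an illustration under assumptions of my own choosing: $X$ uniform on a symmetric grid, $Y = X^2 \pm 1$, and hypothetical helpers `pearson`, `conditional_mean` and `variance`. The variables are not standardised, which does not matter here because correlation is invariant to location and scale.

```python
import math

def pearson(xs, ys):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def conditional_mean(xs, ys):
    """Empirical E(Y | X): replace each y by the mean of all y sharing its x."""
    groups = {}
    for x, y in zip(xs, ys):
        groups.setdefault(x, []).append(y)
    means = {x: sum(v) / len(v) for x, v in groups.items()}
    return [means[x] for x in xs]

def variance(v):
    """Population variance of a sample."""
    m = sum(v) / len(v)
    return sum((t - m) ** 2 for t in v) / len(v)

# Illustrative data: each grid point x appears twice, with y = x**2 - 1 and x**2 + 1.
grid = [-3, -2, -1, 0, 1, 2, 3]
xs = [float(x) for x in grid for _ in (0, 1)]
ys = [x * x + eps for x in grid for eps in (-1.0, 1.0)]

# Projection onto the span of X: the coefficient is corr(X, Y), which is 0 here.
r_xy = pearson(xs, ys)

# Projection onto S_X: the conditional mean recovers x**2 and explains
# most of the variance of Y.
cond = conditional_mean(xs, ys)
explained = variance(cond) / variance(ys)

# The transformation from the question, f(s) = s**2 applied to X, is one
# particular element of S_X and picks up the quadratic dependence.
r_sq = pearson([x * x for x in xs], ys)

print(r_xy, explained, r_sq)
```

Here the correlation of $X$ and $Y$ vanishes although $\mathbf{E}(Y \mid X) = X^2$ is far from constant. Correlating $f(X)$ with $Y$ for one fixed $f$ tests orthogonality to a single direction inside $\mathrm{S}_X$, so it detects the dependence only when $f$ happens to be a good guess.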
