Correlation between linearly transformed vectors


Consider two $N\times 1$ complex vectors $x$ and $y$, and an $N\times N$ complex random matrix $T$ whose entries are drawn i.i.d. from a complex normal distribution. Define the (Pearson) correlation coefficient between $x$ and $y$ as $$C(x,y)=\frac{x^\dagger y}{ \sqrt{ (x^\dagger x)( y^\dagger y)}}, $$ assuming $x$ and $y$ have zero mean ($\dagger$ denotes the conjugate transpose). Numerically, I find that $C(x,y)$ is very close to $C(Tx,Ty)$:

$$ \frac{x^\dagger y}{ \sqrt{ (x^\dagger x)( y^\dagger y)}} \approx \frac{x^\dagger T^\dagger T y}{ \sqrt{ (x^\dagger T^\dagger T x)( y^\dagger T^\dagger T y)}}.$$

How can we prove this? It would be easy if we had $T^\dagger T\approx I$, but that doesn't seem quite right: $T^\dagger T$ is indeed close to diagonal, yet $x^\dagger T^\dagger T x$ is not close to $x^\dagger x$. By numerically generating many realisations of $x$, $y$, and $T$ and computing these quantities, I find that $C(Tx,Ty)$ vs. $C(x,y)$ is very well fitted by the identity function ($R^2$ of 0.97), while $x^\dagger T^\dagger T x$ vs. $x^\dagger x$ is not ($R^2$ of 0.5).
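The observation is easy to reproduce; here is a minimal NumPy sketch (the dimension $N$ and the use of standard complex normal entries are arbitrary choices for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
N = 500

def corr(x, y):
    # Pearson-style coefficient for zero-mean complex vectors
    return (x.conj() @ y) / np.sqrt((x.conj() @ x) * (y.conj() @ y))

# random complex vectors and a matrix with i.i.d. complex normal entries
x = rng.standard_normal(N) + 1j * rng.standard_normal(N)
y = rng.standard_normal(N) + 1j * rng.standard_normal(N)
T = (rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))) / np.sqrt(2)

# the two coefficients agree up to O(1/sqrt(N)) fluctuations
print(abs(corr(x, y) - corr(T @ x, T @ y)))  # small for large N
```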

There are 2 answers below.

Answer 1:

The columns of $T$ are $N$ i.i.d. $N$-dimensional random vectors, so $T^\dagger T$ concentrates around the scaled identity matrix $NI$ (for zero-mean, unit-variance entries). Hence your observation holds: the factor $N$ cancels between the numerator and the denominator of the correlation coefficient. The low $R^2$ for $x^\dagger x$ vs. $x^\dagger T^\dagger T x$ is due to that scaling factor $N$: one quantity is approximately $N$ times the other, so the identity function is a poor fit.
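This concentration is easy to check numerically: $T^\dagger T/N$ is entrywise close to $I$, even though $T^\dagger T$ itself is far from $I$. A sketch (assuming standard complex normal entries, with an arbitrary $N$):

```python
import numpy as np

rng = np.random.default_rng(1)
N = 400
# i.i.d. entries with mean 0 and E|T_ij|^2 = 1
T = (rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))) / np.sqrt(2)

G = T.conj().T @ T / N  # should concentrate around the identity
print(np.abs(np.diag(G) - 1).mean())           # diagonal close to 1
print(np.abs(G - np.diag(np.diag(G))).mean())  # off-diagonal entries small
```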

Answer 2:

Let $\mathcal N(\mu, \sigma^2)$ be the common distribution of the i.i.d. entries of $T$. You can easily prove by the law of large numbers that:

$$\frac{1}{N} e_i^{\dagger}T^{\dagger} Te_j \underset {N \to \infty}\to \begin{cases}\left|\mu\right|^2 + \sigma^2 & \text{if $i=j$}\\ \left|\mu\right|^2 & \text{otherwise}\end{cases}$$

Indeed, if $i=j$, $$\frac{1}{N} e_i^{\dagger}T^{\dagger} Te_i = \frac1N \sum_{k=1}^{N} T^{*}_{k,i}T_{k,i} \to \mathbb E\left[\left|T_{1,1}\right|^2\right] = \left|\mu\right|^2 + \sigma^2,$$ and if $i\neq j$, the two factors in each summand are independent, so $$\frac1N \sum_{k=1}^{N} T^{*}_{k,i}T_{k,j} \to \mathbb E\left[T_{1,1}^*\right]\mathbb E\left[T_{1,2}\right] = \left|\mu\right|^2.$$

Now with that you can prove that $$\frac{1}{N}x^\dagger T^\dagger T y=\sum_{i,j=1}^N x_i^*y_j\frac{1}{N} e_i^{\dagger}T^{\dagger} Te_j \to \left|\mu\right|^2\left(x^\dagger \mathbf 1\right)\left(\mathbf 1^\dagger y \right) + \sigma^2 x^\dagger y$$

Use this property in your Pearson coefficient: for zero-mean entries ($\mu=0$), the first term vanishes, so $\frac1N x^\dagger T^\dagger T y \to \sigma^2\, x^\dagger y$. The factors of $\sigma^2$ then cancel between the numerator and the denominator of the ratio, giving $C(Tx,Ty)\to C(x,y)$.
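The limit above can also be checked numerically, including the $\left|\mu\right|^2$ term. A sketch with NumPy (the values of $\mu$, $\sigma$, and $N$ are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(2)
N = 2000
mu, sigma = 0.3 + 0.2j, 1.0

x = rng.standard_normal(N) + 1j * rng.standard_normal(N)
y = rng.standard_normal(N) + 1j * rng.standard_normal(N)
# entries T_ij = mu + sigma * (standard complex normal)
T = mu + sigma * (rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))) / np.sqrt(2)

ones = np.ones(N)
lhs = (x.conj() @ (T.conj().T @ (T @ y))) / N
rhs = abs(mu)**2 * (x.conj() @ ones) * (ones @ y) + sigma**2 * (x.conj() @ y)
print(abs(lhs - rhs) / (abs(lhs) + abs(rhs)))  # relative gap, shrinks as N grows
```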