Decorrelating variables using Cholesky decomposition


I am looking for a method to decorrelate several variables, so that their covariance matrix is diagonal, while keeping the original mean for each of them.

I found this old article which seemed pretty much what I was looking for: http://blogs.sas.com/content/iml/2012/02/08/use-the-cholesky-transformation-to-correlate-and-uncorrelate-variables.html

As I understood it:

  1. I should choose a target covariance matrix Cov (I chose the correlation matrix to keep the mean and standard deviation, but I am not sure this is needed).
  2. I should compute its Cholesky factor L(Cov), i.e. the lower triangular matrix such that $Cov = L L^{t}$.
  3. I should multiply my random variable matrix A by L(Cov) to obtain the product $B = A * L(Cov)$.
  4. The covariance matrix of B, $Cov(B)$, should be equal to Cov, while the means of the modified random variables in B should remain the same as the means of the random variables in A. According to my tests, neither holds.
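For reference, here is a minimal NumPy sketch of steps 1 to 4. It makes several assumptions that are not stated above: rows of the data matrix are observations, the input columns are already uncorrelated with unit variance, and the product is taken with the transposed factor so that the row convention works out. The helper name `cholesky_correlate` and the example numbers are made up for illustration.

```python
import numpy as np

def cholesky_correlate(Z, target_cov):
    """Map the columns of Z toward covariance target_cov.

    Assumes rows of Z are observations and that the columns of Z are
    already uncorrelated with unit variance (an assumption of this sketch).
    """
    L = np.linalg.cholesky(target_cov)  # lower triangular, target_cov = L @ L.T
    return Z @ L.T

rng = np.random.default_rng(0)
target_cov = np.array([[1.0, 0.6],
                       [0.6, 1.0]])

# Uncorrelated, zero-mean, unit-variance input: the sample covariance of
# the output is close to target_cov, up to sampling noise.
Z = rng.standard_normal((100_000, 2))
X = cholesky_correlate(Z, target_cov)
cov_X = np.cov(X, rowvar=False)

# Input with non-zero means: the output means become mean(Z) @ L.T,
# so they are not preserved unless the data are centered first.
Z_shifted = Z + np.array([5.0, -3.0])
X_shifted = cholesky_correlate(Z_shifted, target_cov)
```

In this sketch the target covariance is only reached because the input is uncorrelated with unit variance. For a general input the result is $L \, Cov(Z) \, L^{t}$, and any non-zero mean is transformed along with the data.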

In particular, if I use $Cov=Cov(A)^{-1}$ as input, I should obtain $L(Cov)=L(Cov(A))^{-1}$, and B should be decorrelated, so $Cov(B)$ should be diagonal. According to my tests, it is not.

At the same time, I am confused: multiplying A by $L(I)$, where the identity matrix I is used as the value of Cov, cannot turn $Cov(B)$ into the identity matrix if $Cov(A)$ was not the identity to begin with.

I tried with both randomly distributed and normally distributed samples, but I was not able to reproduce results similar to the article's, and the means were lost.

Did I do something wrong in my use of the Cholesky decomposition to decorrelate variables? Or, if it is not supposed to work, what simple method would you advise?


There is 1 best solution below

On BEST ANSWER

It seems the article I quoted describes the "whitening transformation" shown here: https://en.wikipedia.org/wiki/Whitening_transformation. Thus, I should center and scale (standardize) my variables before multiplying A by $L(Cov(A))^{-1}$. I don't yet understand why this step is needed, but I'll try it this way.

After that, I may multiply by the original standard deviation and add the mean to recover decorrelated variables.
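A minimal NumPy sketch of this plan might look as follows. It assumes that rows of A are observations, that the sample covariance is positive definite, and that the Cholesky factor is taken from the covariance of the centered data (with fully standardized data the same steps would use the correlation matrix instead). The function name `decorrelate_keep_moments` and the example data are hypothetical.

```python
import numpy as np

def decorrelate_keep_moments(A):
    """Decorrelate the columns of A while keeping their sample means
    and standard deviations (a sketch, not a reference implementation)."""
    mean = A.mean(axis=0)
    std = A.std(axis=0, ddof=1)          # same ddof as np.cov's default
    centered = A - mean
    cov = np.cov(A, rowvar=False)        # sample covariance, cov = L @ L.T
    L = np.linalg.cholesky(cov)          # lower triangular factor
    # Whitening: Z = centered @ inv(L).T, obtained by solving L @ Z.T = centered.T
    Z = np.linalg.solve(L, centered.T).T
    # Z has zero mean and identity sample covariance; restore scale and mean.
    return Z * std + mean

# Example with correlated, non-centered data (made-up numbers).
rng = np.random.default_rng(1)
mixing = np.array([[1.0, 0.0, 0.0],
                   [0.8, 0.5, 0.0],
                   [-0.3, 0.4, 1.2]])
A = rng.standard_normal((5000, 3)) @ mixing.T + np.array([10.0, -2.0, 4.0])
B = decorrelate_keep_moments(A)
cov_B = np.cov(B, rowvar=False)          # approximately diagonal
```

Centering matters here because the covariance only describes deviations from the mean. Applying the inverse factor to uncentered data would also transform the means, which is consistent with the means being lost in the earlier tests.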

I'll post an update on whether it works or not.