How to compute the distance between a matrix and its lower-rank approximation?


I have a matrix $X$ and a lower-rank approximation $Z$ of $X$, obtained using only a few of the columns of $X$.

I would like a measure of how distant $X$ and $Z$ are. In particular, I would like something similar to what is usually done with PCA, as explained here.

I need something like $\mathrm{std}(X-\hat X)$, where $\mathrm{Projection}(\hat X)=Z$, but that at the same time takes the correlation between the variables into account.

Suppose, for example, $X=(x_1, x_1, x_2, x_3)$ and $Z = (x_1, x_3)$; then $\mathrm{std}(X-\hat X)= \mathrm{std}((x_1,x_1,x_2,x_3) - (0,x_1,0,x_3))$, and I want this result to equal $\mathrm{std}(x_2)$ rather than $\mathrm{std}(x_1) + \mathrm{std}(x_2)$, because the information contained in $x_1$ is already contained in $Z$.
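A small numerical illustration of this point (the data and variable names below are made up for the sketch): with a duplicated column, the naive columnwise residual still charges for the spread of $x_1$ even though $Z$ already contains it.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
x3 = rng.normal(size=n)

# X duplicates x1; Z keeps one copy of x1 and x3, so X_hat has those
# columns in place and zeros elsewhere.
X = np.column_stack([x1, x1, x2, x3])
X_hat = np.column_stack([np.zeros(n), x1, np.zeros(n), x3])

# The residual is (x1, 0, x2, 0), so summing per-column stds counts x1's
# spread again even though Z already captures that information.
naive = (X - X_hat).std(axis=0).sum()
```

Here `naive` comes out as $\mathrm{std}(x_1) + \mathrm{std}(x_2)$, which is exactly the double-counting the question wants to avoid.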

Consider, for example, this little Python snippet:

import numpy as np

X = np.random.rand(10, 20)  # np.ndarray(shape=(10, 20)) would leave X uninitialized
Z = X[:, [1, 3, 5]]

Ideally, the answer would include both a theoretical explanation and a Python implementation.

EDIT: possibly of interest: Sparse PCA.

There is 1 answer below.


If you want to compute the distance between two matrices $X \in \mathbb V$ and $Y \in \mathbb V$, you need a distance function on the space $\mathbb V$. One way to define such a distance is through the Frobenius norm $\Vert X \Vert _F = \sqrt{\sum_{i=1}^M \sum_{j=1}^N X_{ij}^2}$, giving $d_F(X,Y)=\Vert X - Y\Vert _F$. You can easily verify that $d_F(\cdot,\cdot)$ satisfies the required properties of a distance function: i) positivity, ii) the triangle inequality, and iii) symmetry.

Please note that certain matrix spaces, such as the positive definite matrices, define a specific geometry; for instance, the positive definite matrices form a positive cone. For such manifolds there are more intrinsic distance functions, e.g. the S-divergence.
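A minimal NumPy sketch of this distance, using the question's shapes. The least-squares reconstruction of $X$ from its selected columns is an assumption on my part (the answer only specifies the Frobenius distance itself); the distance is computed both from the definition above and with NumPy's built-in norm.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(10, 20))

# Build a low-rank approximation from a few columns of X: regress X onto
# the selected columns. (This reconstruction step is an assumption, not
# part of the answer, which only defines the distance.)
C = X[:, [1, 3, 5]]
B, *_ = np.linalg.lstsq(C, X, rcond=None)
X_hat = C @ B  # best approximation of X within the column span of C

# Frobenius distance: explicit double sum from the definition, and the
# equivalent built-in matrix norm.
d_manual = np.sqrt(((X - X_hat) ** 2).sum())
d_builtin = np.linalg.norm(X - X_hat, "fro")
```

Both expressions agree; `np.linalg.norm(A, "fro")` is simply the built-in form of the explicit double sum in the definition.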