I have a matrix $X$ and a lower-rank approximation $Z$ of $X$, obtained using only a few of the columns of $X$.
I would like a measure of how distant $X$ and $Z$ are. In particular, I would like something similar to what is usually done with PCA, as explained here.
I need something like $std(X-\hat X)$, where $Projection(\hat X)=Z$, but that at the same time takes into account the correlation between the variables.
Suppose, for example, $X=(x_1, x_1, x_2, x_3)$ and $Z = (x_1, x_3)$; then $std(X-\hat X)= std((x_1,x_1,x_2,x_3) - (0,x_1,0,x_3))$, and I want this result to equal $std(x_3)$, not $std(x_1) + std(x_3)$, because the information contained in $x_1$ is already contained in $Z$.
Consider, for example, this small Python snippet (note that `np.ndarray(shape=(10,20))` would return uninitialized memory, so random data is used instead):

import numpy as np
X = np.random.rand(10, 20)
Z = X[:, [1, 3, 5]]
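One way to make the residual "take correlation into account" — this is a sketch of one possible interpretation, not necessarily the answer being asked for — is to build $\hat X$ by least-squares projection of each column of $X$ onto the column space of $Z$. Any column of $X$ that is a linear combination of $Z$'s columns (such as a duplicated column) then contributes zero residual:

```python
import numpy as np

rng = np.random.default_rng(0)
x1 = rng.standard_normal(100)
x2 = rng.standard_normal(100)
x3 = rng.standard_normal(100)

X = np.column_stack([x1, x1, x2, x3])  # x1 appears twice
Z = np.column_stack([x1, x3])          # the retained columns

# Least-squares coefficients B so that Z @ B best reconstructs X
B, *_ = np.linalg.lstsq(Z, X, rcond=None)
X_hat = Z @ B
residual = X - X_hat

# Columns of X lying in the span of Z (both x1 copies and x3)
# leave essentially zero residual; only x2's unexplained part remains.
print(residual.std(axis=0))
```

With this construction the duplicated $x_1$ column costs nothing, which matches the intuition that information already in $Z$ should not be counted twice.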
The ideal answer would come with both a theoretical explanation and a Python implementation.
EDIT: possibly of interest: Sparse PCA.
If you want to compute the distance between two matrices $X \in \mathbb V$ and $Y \in \mathbb V$, you need a distance function on the space $\mathbb V$. One way to define such a distance is via the Frobenius norm $\Vert X \Vert _F = \sqrt{\sum_{i=1}^M \sum_{j=1}^N X_{ij}^2},$ which gives $d_F(X,Y)=\Vert X - Y\Vert _F$. You can easily verify that $d_F(\cdot,\cdot)$ satisfies the required properties of a distance function: i) positivity, ii) triangle inequality, iii) symmetry.

Note that certain matrix spaces, such as the positive definite matrices, carry a specific geometry; for instance, the positive definite matrices form a positive cone. For such manifolds there are more intrinsic distance functions, e.g. the S-divergence.
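In NumPy the Frobenius distance is a one-liner via `np.linalg.norm` with `ord="fro"`; a minimal sketch:

```python
import numpy as np

def frobenius_distance(X, Y):
    """d_F(X, Y) = ||X - Y||_F, the entrywise Euclidean distance."""
    return np.linalg.norm(X - Y, ord="fro")

X = np.array([[1.0, 2.0],
              [3.0, 4.0]])
Y = np.array([[1.0, 0.0],
              [3.0, 4.0]])

# The difference has a single nonzero entry (2), so
# d_F = sqrt(2^2) = 2
print(frobenius_distance(X, Y))  # -> 2.0
```

Symmetry follows directly ($\Vert X-Y\Vert_F = \Vert Y-X\Vert_F$), and the triangle inequality is inherited from the Euclidean norm on the flattened entries.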