Reconstruction error for PCA not equal to zero?


I am working on a school assignment that asks us to perform PCA on a data set of 500 data points, each of dimension $p=256$. Usually you project the data onto the subspace spanned by the $q$ eigenvectors corresponding to the $q$ largest eigenvalues, so $q$ is normally chosen smaller than $p$. However, if I take $q=256$, use PCA to reconstruct the original mean of the data, and then compute the RMSE between the original mean and the reconstructed mean, I obtain an RMSE of 0.3063. Why is this the case? Shouldn't the RMSE be zero when $q=p$?


There is 1 best solution below


There's a chance I'm in the same class as you, because I also have an assignment where I was getting an RMSE of about 0.3. You should get an RMSE of 0; the mistake I made was in reconstructing the data. Make sure the dataset X has been centered so that it has a mean of 0, and use this same centered dataset when you perform the reconstruction. I forgot to do this, so when I reconstructed the data I added the mean back on, which threw my reconstruction off. This article helped me a lot: https://towardsdatascience.com/principal-components-analysis-explained-53f0639b2781
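To illustrate the point, here is a minimal NumPy sketch. It is not the assignment's code: the data are synthetic random numbers with the same shape (500 points, $p=256$), and the names `X`, `mu`, `W`, and `rmse` are my own. With $q=p$ the eigenvector matrix $W$ is orthogonal, so $WW^\top = I$ and the reconstruction $(X-\mu)WW^\top + \mu$ equals $X$ up to floating-point error. The second reconstruction shows one possible way the centering mismatch can arise, namely projecting the uncentered data but still adding the mean back.

```python
import numpy as np

# Synthetic stand-in for the assignment data (assumption: 500 points, p = 256).
rng = np.random.default_rng(0)
n, p = 500, 256
X = rng.normal(loc=3.0, scale=1.0, size=(n, p))

# Center the data: subtract the per-feature mean.
mu = X.mean(axis=0)
Xc = X - mu

# Eigendecomposition of the sample covariance matrix.
cov = Xc.T @ Xc / (n - 1)
eigvals, eigvecs = np.linalg.eigh(cov)   # eigenvalues in ascending order
order = np.argsort(eigvals)[::-1]        # reorder to descending
V = eigvecs[:, order]


def rmse(a, b):
    """Root-mean-square error over all entries of two equally shaped arrays."""
    return float(np.sqrt(np.mean((a - b) ** 2)))


q = p               # keep every component
W = V[:, :q]        # p x q matrix of principal directions

# Consistent: project the centered data, then add the mean back.
X_hat = (Xc @ W) @ W.T + mu
rmse_consistent = rmse(X, X_hat)

# Inconsistent: project the uncentered data, but still add the mean back.
X_bad = (X @ W) @ W.T + mu
rmse_inconsistent = rmse(X, X_bad)

print("consistent centering:  ", rmse_consistent)
print("inconsistent centering:", rmse_inconsistent)
```

In the inconsistent case with $q=p$, the reconstruction is simply $X + \mu$, so the RMSE equals the root-mean-square of the mean vector rather than zero. If your data have a small but nonzero mean, that alone could produce an error of the size you are seeing.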