PCA covariance matrix: total number of data points


I have a small question about PCA, specifically in calculating the covariance matrix. I know that to calculate the covariance matrix $C$, I have to subtract the mean from the data points and form the data matrix $X$, and thus I have

$$ C = \frac{1}{n}XX^T $$
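To make the setup concrete, here is a minimal NumPy sketch of the computation as I understand it. It assumes $X$ is stored as a $d \times n$ array with one data point per column (the layout that makes $XX^T$ a $d \times d$ matrix); the function name `covariance_matrix` and the toy data, which contain two identical points, are just illustrative.

```python
import numpy as np

def covariance_matrix(points):
    """Return (1/n) X X^T for a (d, n) array with one data point per column.

    The column-per-point layout is an assumption made for this sketch.
    """
    points = np.asarray(points, dtype=float)
    n = points.shape[1]                                # total number of data points
    X = points - points.mean(axis=1, keepdims=True)    # subtract the mean of each feature
    return X @ X.T / n

# Toy data: the first two columns are identical data points.
data = np.array([[1.0, 1.0, 2.0, 4.0],
                 [2.0, 2.0, 0.0, 4.0]])

C = covariance_matrix(data)
print(C)
```

Here $n = 4$, so both copies of the repeated point are counted when computing the mean and when dividing by $n$.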

My question is: what if some of the data points are the same? For example, if there are 2 identical data points, should I subtract 1 from the total number of data points $n$?

Also, in this case, does the data matrix $X$ need to change, i.e. should the duplicate data point be excluded from it?

Thank you