Why do we scale by $\frac{1}{N-1}$ while calculating the covariance matrix in PCA?

502 Views Asked by Ranveer At 06 Jun 2025 - 6:29

When we perform Principal Component Analysis (PCA) on a set of $N$ $d$-dimensional vectors, we scale the covariance matrix by a factor of $\frac{1}{N-1}$.
Here's what we do in PCA:

1. We calculate the mean $\bar{x}$ of all the $d$-dimensional vectors.
2. We subtract the mean from each vector.
3. We calculate the covariance matrix:
$C = \frac{1}{N-1}\sum_{i=1}^{N}(x_i-\bar{x})\cdot(x_i-\bar{x})^T$

and so on.
In the third step, why do we multiply by a factor of $\frac{1}{N-1}$?
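The three steps above can be sketched in plain Python as follows. This is only a minimal illustration: the function name `covariance_matrix` and the example data are made up for this sketch and are not part of any library.

```python
def covariance_matrix(vectors):
    """Sample covariance matrix of N d-dimensional vectors, scaled by 1/(N-1)."""
    n = len(vectors)
    if n < 2:
        raise ValueError("need at least two vectors to divide by N - 1")
    d = len(vectors[0])
    # Step 1: mean of all the vectors, one entry per dimension.
    mean = [sum(v[j] for v in vectors) / n for j in range(d)]
    # Step 2: subtract the mean from each vector.
    centered = [[v[j] - mean[j] for j in range(d)] for v in vectors]
    # Step 3: sum the outer products (x_i - mean)(x_i - mean)^T, scale by 1/(N-1).
    return [
        [sum(c[j] * c[k] for c in centered) / (n - 1) for k in range(d)]
        for j in range(d)
    ]


# Made-up example data: N = 4 vectors with d = 2.
data = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
C = covariance_matrix(data)
print(C)
```

The result is a symmetric $d \times d$ matrix whose diagonal entries are the per-dimension sample variances.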


Ian On 12 Nov 2015 - 10:54

It is for exactly the same reason that the standard estimator for the variance divides by $N-1$ instead of $N$. If you divide by $N$ instead, the estimator you get is biased, as you can see by directly calculating its expected value.
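For the scalar case, that expected value can be worked out in two lines. This is a sketch under the assumption that the $x_i$ are independent draws from a distribution with mean $\mu$ and finite variance $\sigma^2$ (the symbol $\sigma^2$ is introduced here for illustration). Expanding around $\mu$ gives

$$\sum_{i=1}^{N}(x_i-\overline{x})^2 = \sum_{i=1}^{N}(x_i-\mu)^2 - N(\overline{x}-\mu)^2$$

and taking expectations, using $E[(\overline{x}-\mu)^2] = \frac{\sigma^2}{N}$,

$$E\left[\sum_{i=1}^{N}(x_i-\overline{x})^2\right] = N\sigma^2 - N\cdot\frac{\sigma^2}{N} = (N-1)\sigma^2$$

So dividing the sum by $N$ gives an estimator with expected value $\frac{N-1}{N}\sigma^2$, while dividing by $N-1$ gives exactly $\sigma^2$. The same argument applies entry by entry to the covariance matrix.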

There are various intuitive ways to explain this. One is that by replacing $\mu$ with $\overline{x}$, you "lose a degree of freedom", since the differences between the values and the sample mean are not independent: they always sum to zero. This perspective is useful in a number of similar contexts where you have to subtract something from the sample size to get the appropriate estimator.

Another is to think about the extreme case $N=1$: in this case, if you divide by $N$, you will always get zero, which is clearly not a good estimate for the population variance in general.
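Both the bias and the $N=1$ case can be checked numerically. The following is a rough simulation sketch, assuming standard normal data (true variance 1); the helper names `sum_sq_dev` and `average_estimates`, the sample size, the trial count, and the seed are all arbitrary choices made for illustration.

```python
import random


def sum_sq_dev(xs):
    """Sum of squared deviations of xs from their own sample mean."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs)


def average_estimates(n, trials=20000, seed=0):
    """Average the divide-by-N and divide-by-(N-1) estimates over many samples."""
    rng = random.Random(seed)
    total_n = 0.0
    total_n_minus_1 = 0.0
    for _ in range(trials):
        # Draw n values from a normal distribution with true variance 1.
        xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
        s = sum_sq_dev(xs)
        total_n += s / n
        total_n_minus_1 += s / (n - 1)
    return total_n / trials, total_n_minus_1 / trials


biased, unbiased = average_estimates(n=5)
print("divide by N:  ", biased)    # expected to land near (N-1)/N = 0.8
print("divide by N-1:", unbiased)  # expected to land near the true variance 1.0

# Extreme case N = 1: the single value equals its own mean,
# so dividing by N always gives zero.
print(sum_sq_dev([3.7]) / 1)
```

With $N=5$ the divide-by-$N$ average should come out around 20% too small, matching the factor $\frac{N-1}{N}$.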
