Clarifying the constraints used in deriving the Principal Components of PCA


In studying principal components analysis, I am confused by one point.

For a set of $N$ zero-centered data points of dimension $m$, to be projected to a dimension $k < m$, we want a set of $k$ vectors of dimension $m$, say $w_{i}$, each of which maximises the variance of the data projected onto it.

What I don't understand is this: do we start from this idea alone and then conclude, after some derivation, that the vectors $w_i$ must be orthogonal? Perhaps the $w_i$ are initially only required to be linearly independent?

Or rather, is it an original constraint of principal components analysis that the $w_i$ must be mutually orthogonal?

Best answer

In the standard sequential formulation of PCA, orthogonality of the $w_i$ is an original constraint: it is imposed from the start rather than derived afterwards.

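The first principal component is the unit-norm direction $w_1$ along which the projected data have the largest possible variance. The second component $w_2$ is the unit-norm direction of largest possible variance subject to the constraint of being orthogonal to $w_1$, and each later component is defined the same way, orthogonal to all the earlier ones.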
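Written as optimisation problems, the first two steps might look as follows. This is only a sketch, and it assumes the data are stacked as the rows of an $N \times m$ matrix $X$ (a symbol introduced here, not used in the question) with sample covariance $\Sigma = \frac{1}{N} X^\top X$:

$$
w_1 = \arg\max_{\|w\| = 1} \; w^\top \Sigma\, w,
\qquad
w_2 = \arg\max_{\substack{\|w\| = 1 \\ w^\top w_1 = 0}} \; w^\top \Sigma\, w .
$$

The maximisers are unit eigenvectors of $\Sigma$ belonging to its largest and second-largest eigenvalues. The norm constraint $\|w\| = 1$ keeps the variance from being inflated simply by rescaling $w$, and without the constraint $w^\top w_1 = 0$ the second problem would be identical to the first and would return $w_1$ again.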

This is intuitive: if $w_2$ is linearly independent of $w_1$ but not orthogonal to it, then $w_2$ has a nonzero component along $w_1$, so the projection onto $w_2$ partly repeats information that we already have via $w_1$.
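The redundancy can be checked numerically. The following is a minimal sketch in Python with NumPy, not a reference implementation: the toy data, the mixing matrix `A`, and the names `w_tilted` and `variance_along` are all hypothetical choices made for illustration. It compares the scores along $w_1$ with the scores along an orthogonal $w_2$ and along a direction that is linearly independent of $w_1$ but not orthogonal to it.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy data: N = 500 points in m = 3 dimensions with correlated features.
A = np.array([[3.0, 0.0, 0.0],
              [1.0, 2.0, 0.0],
              [0.5, 0.5, 0.5]])
X = rng.normal(size=(500, 3)) @ A
X -= X.mean(axis=0)  # zero-centre the data, as the question assumes

cov = X.T @ X / X.shape[0]  # sample covariance (m x m)

# eigh returns ascending eigenvalues and orthonormal eigenvectors (as columns);
# reorder so that the largest-variance direction comes first.
eigvals, eigvecs = np.linalg.eigh(cov)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]
w1, w2 = eigvecs[:, 0], eigvecs[:, 1]

def variance_along(w):
    """Variance of the data projected onto the direction of w (normalised first)."""
    w = w / np.linalg.norm(w)
    return float(w @ cov @ w)

# A unit vector that is linearly independent of w1 but NOT orthogonal to it.
w_tilted = (w1 + w2) / np.sqrt(2)

# Correlation between the scores along w1 and along each candidate second direction.
corr_orthogonal = np.corrcoef(X @ w1, X @ w2)[0, 1]
corr_tilted = np.corrcoef(X @ w1, X @ w_tilted)[0, 1]

print("w1 . w2                :", float(w1 @ w2))
print("corr(scores1, scores2) :", corr_orthogonal)
print("corr(scores1, tilted)  :", corr_tilted)
print("variances (w1, tilted, w2):",
      variance_along(w1), variance_along(w_tilted), variance_along(w2))
```

In this sketch the scores along the orthogonal pair are uncorrelated up to floating-point error, whereas the scores along `w_tilted` remain strongly correlated with those along `w1`. The tilted direction does have more variance than `w2`, but only because it borrows part of the variance already accounted for by `w1`.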

Reference: Principal Component Analysis