How does the covariance matrix $C = AA^T$ justify the actual $A^TA$ in PCA?


For computational efficiency in PCA, the covariance matrix is evaluated as $C = AA^T$ instead of the original $C = A^TA$. How can that be justified, and what steps are then needed to recover the original information? Although I can see the beauty and the idea of the PCA algorithm, I am still having a hard time understanding some of the concepts.

Note: Here $C$ is the covariance matrix and $A$ is the matrix whose columns are the data vectors.


Why use these matrix products at all?

Just compute the SVD, using Golub-Kahan or better,

$$A=USV^T,$$

with $S$ diagonal with non-negative diagonal entries and $U, V$ orthogonal. Then you automatically also know the eigenvector decompositions

$$AA^T=U(SS^T)U^T\text{ and }A^TA=V(S^TS)V^T.$$
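As a quick numerical check, here is a minimal sketch (assuming NumPy, with a small random matrix standing in for the data matrix $A$) showing that a single SVD delivers the eigenvectors of both $AA^T$ and $A^TA$:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3))  # stand-in 5x3 data matrix

# A = U @ diag(s) @ Vt, with s the singular values sigma_k
U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Columns of U are eigenvectors of A A^T with eigenvalues sigma_k^2:
print(np.allclose(A @ A.T @ U, U * s**2))        # True

# Columns of V (rows of Vt) are eigenvectors of A^T A, same eigenvalues:
print(np.allclose(A.T @ A @ Vt.T, Vt.T * s**2))  # True
```

Both checks pass, illustrating $AA^T = U(SS^T)U^T$ and $A^TA = V(S^TS)V^T$ without ever diagonalizing either product explicitly.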


Representing the matrix as $A=\sum_{k=1}^r σ_k u_kv_k^T$ also lets you see how the conversion you asked about works. If $AA^T$ is diagonalized, the result can be written as $AA^T=\sum_{k=1}^r σ_k^2 u_ku_k^T$, so the pairs $(λ_k=σ_k^2, u_k)$ are the result of the diagonalization. Since the vectors $u_k$ are orthonormal, one gets $A^Tu_k=σ_k v_k$, and hence $v_k=\frac1{\sqrt{λ_k}}A^Tu_k$, where the $v_k$ are the orthonormal eigenvectors of $A^TA$.
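The recovery step above can be sketched numerically as well (again assuming NumPy and a small random stand-in for $A$): diagonalize $AA^T$ to get the pairs $(λ_k, u_k)$, then form $v_k = A^Tu_k/\sqrt{λ_k}$ and verify these are orthonormal eigenvectors of $A^TA$.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 6))  # stand-in 4x6 data matrix

# Diagonalize the smaller product A A^T (4x4 here)
lam, U = np.linalg.eigh(A @ A.T)

# Keep only the nonzero eigenvalues (rank r of A)
keep = lam > 1e-12
lam, U = lam[keep], U[:, keep]

# Recover v_k = A^T u_k / sqrt(lam_k), column by column via broadcasting
V = (A.T @ U) / np.sqrt(lam)

# The v_k are orthonormal ...
print(np.allclose(V.T @ V, np.eye(V.shape[1])))  # True
# ... and are eigenvectors of A^T A with the same eigenvalues:
print(np.allclose(A.T @ A @ V, V * lam))         # True
```

This is exactly the trick used in practice (e.g. in eigenfaces): diagonalize whichever of $AA^T$, $A^TA$ is smaller, then map the eigenvectors across through $A$.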