How to interpret the diagonalization of a scatter matrix in Principal Component Analysis (PCA)?


Let $X \in \mathbb{R}^{2 \times n}$ be the data matrix, $S_n = (X - \hat{\mu})(X - \hat{\mu})^T$ be the scatter matrix, and

$$ S_n = \begin{bmatrix} 3 & -4 \\ -4 & 3 \end{bmatrix} \begin{bmatrix} 0.63 & 0 \\ 0 & 2.74 \end{bmatrix} \begin{bmatrix} -\frac{3}{7} & -\frac{4}{7} \\ -\frac{4}{7} & -\frac{3}{7} \end{bmatrix} \quad \text{be its diagonalization.} $$
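To make the setup concrete, here is a minimal NumPy sketch of the definitions above. It rests on several assumptions: the data values are made up, $\hat{\mu}$ is taken to be the per-row sample mean, and the diagonalization is the orthonormal form $S_n = P D P^T$ that `np.linalg.eigh` returns. It is not meant to reproduce the numbers in the matrices above.

```python
import numpy as np

# Hypothetical 2 x n data matrix (each column is one sample); values are made up.
X = np.array([[2.0, 0.0, -1.0, 3.0, 1.0],
              [1.0, 3.0,  4.0, 0.0, 2.0]])

# Assumption: mu_hat is the sample mean of each row, kept as a (2, 1) column
# so that it broadcasts across the n samples.
mu_hat = X.mean(axis=1, keepdims=True)
Xc = X - mu_hat

# Scatter matrix S_n = (X - mu_hat)(X - mu_hat)^T, shape (2, 2).
S = Xc @ Xc.T

# eigh is intended for symmetric matrices: it returns eigenvalues in ascending
# order and orthonormal eigenvectors as the columns of P, so S = P D P^T.
eigvals, P = np.linalg.eigh(S)
D = np.diag(eigvals)

# Under this convention, the first principal component is the eigenvector
# (column of P) paired with the largest eigenvalue (diagonal entry of D).
first_pc = P[:, np.argmax(eigvals)]
```

In this form, the diagonal entries of `D` are the eigenvalues and the columns of `P` are the matching eigenvectors, which is how I would expect to read a diagonalization written as a product of three matrices.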

I have been asked for the first principal component of the data according to PCA.

Could someone please help me interpret this diagonalization in the context of Principal Component Analysis (PCA)? Specifically, I would like to understand how to read off the eigenvectors and eigenvalues from the three factors of the diagonalization, and what they represent in PCA.

Any insights or explanations would be greatly appreciated.