For PCA using the eigenvectors of the covariance matrix, what is the meaning of the eigenvalues?


When doing a PCA using the eigenvectors associated with the largest eigenvalues, what do the values of the eigenvalues mean?

Example:

The two eigenvectors associated with the two largest eigenvalues of my dataset are:

1 - [ 6.62257875e-01 -1.63390189e-01  7.31243512e-01 -1.13386505e-04 -9.65364160e-05  1.02781966e-03]

2 - [ 3.31219165e-01 -8.11563370e-01 -4.81309165e-01  4.26282496e-04 3.70709031e-05  2.55801611e-04]

How can I associate these values with the large dispersion of the data in a plot?

There are 3 answers below.

BEST ANSWER

I assume that your data matrix $X$ is $n\times 6$, so that each row represents a single data point, and that the eigenvalues/eigenvectors you are referring to are those of the sample covariance matrix $\frac{1}{n}X^{T}X$ (with the columns of $X$ centered).

For $i=1,2$, let $\lambda_i,v_i$ denote the eigenvalue/eigenvector pairs with $\lambda_1 \geq \lambda_2$; note that your vectors are unit vectors. For each $i$, $\lambda_i$ is the variance of the $v_i$ component of the data points. That is, $\lambda_i$ is the variance of the dot product $x \cdot v_i$ (taken over the rows $x$ of $X$). So if $\bar x$ denotes the average of the rows (the centroid of the data points), then $\bar x \pm \sqrt{\lambda_i}\, v_i$ gives an error bar of one standard deviation along the $v_i$ direction.
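A minimal numerical sketch of this claim, assuming NumPy and a synthetic $n \times 6$ data matrix in place of your data (the variable names are illustrative only):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 6)) @ rng.normal(size=(6, 6))  # synthetic n x 6 data matrix
Xc = X - X.mean(axis=0)                                   # center each column

# Eigendecomposition of the sample covariance matrix (1/n) Xc^T Xc
cov = Xc.T @ Xc / len(Xc)
eigvals, eigvecs = np.linalg.eigh(cov)        # eigenvalues in ascending order
lam, v = eigvals[-1], eigvecs[:, -1]          # largest eigenvalue and its unit eigenvector

# lam should equal the variance of the v-component of the data points
proj_var = np.var(Xc @ v)                     # variance of the dot products x . v
print(np.isclose(lam, proj_var))              # True
```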

ANSWER

I wrote a blog post explaining PCA in more detail than my answer below. It has some pictures that you might also find helpful.

Suppose you are given points $x_{1},\ldots,x_{N}$ in $p$-dimensional (Euclidean) space. Let $\mathbf{x}\sim\operatorname{Unif}(x_{1},\ldots,x_{N})$ be a random vector that takes on one of these points with uniform probability. By definition, the expectation $\mathbb{E}\mathbf{x}$ and variance $\operatorname{Var}(\mathbf{x})$ of this vector are simply the sample mean and sample variance of your points $x_{1},\ldots,x_{N}$. Let's proceed assuming $\mathbb{E}\mathbf{x}=0$ (if this is not true, you can always work with $\mathbf{x}^{\prime}=\mathbf{x}-\mathbb{E}\mathbf{x}$ instead).

The first principal component is defined to be a unit direction $v_{1}$ that maximizes the sample variance of your points: $$ v_{1}=\operatorname{argmax}_{\Vert v\Vert=1}\operatorname{Var}(\mathbf{x}\cdot v). $$ Now, let $X$ be an $N\times p$ matrix $$ X=\begin{pmatrix}x_{1}^{\intercal}\\ \vdots\\ x_{N}^{\intercal} \end{pmatrix} $$ whose rows are your points. Note that $$ \operatorname{Var}(\mathbf{x}\cdot v)=\frac{1}{N}\sum_{i=1}^{N}(x_{i}\cdot v)^{2}=\frac{1}{N}\left\Vert Xv\right\Vert ^{2}=\frac{1}{N}\left(Xv\right)^{\intercal}\left(Xv\right)=\frac{1}{N}v^{\intercal}X^{\intercal}Xv. $$ It follows that the first principal component satisfies $$ v_{1}=\operatorname{argmax}_{\left\Vert v\right\Vert =1}v^{\intercal}X^{\intercal}Xv. $$ We want to apply the method of Lagrange multipliers to solve the above. As such, we define $$ \mathcal{L}(v;\lambda)=v^{\intercal}X^{\intercal}Xv-\lambda\left(\left\Vert v\right\Vert^2 -1\right). $$ The gradient of $\mathcal{L}$ with respect to $v$ is $$ [\nabla_{v}\mathcal{L}](v;\lambda)=2X^{\intercal}Xv-2\lambda v. $$ Setting the gradient to zero, it follows that $v_{1}$ must satisfy $X^{\intercal}Xv_{1}=\lambda v_{1}$. In other words, $v_{1}$ is an eigenvector of $X^{\intercal}X$.
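As a quick numerical sanity check on this characterization (a sketch only, assuming NumPy and a synthetic centered data matrix), the top eigenvector of $X^{\intercal}X$ should achieve at least as much projected variance as any other unit direction:

```python
import numpy as np

rng = np.random.default_rng(1)
N, p = 1000, 4
X = rng.normal(size=(N, p)) @ rng.normal(size=(p, p))   # synthetic data, rows are points
X -= X.mean(axis=0)                                      # enforce E[x] = 0

# Candidate first principal component: top eigenvector of X^T X
_, eigvecs = np.linalg.eigh(X.T @ X)
v1 = eigvecs[:, -1]

def proj_var(v):
    """Sample variance of the projections x_i . v (X is centered)."""
    return np.mean((X @ v) ** 2)

# No random unit direction should beat v1
directions = rng.normal(size=(10000, p))
directions /= np.linalg.norm(directions, axis=1, keepdims=True)
print(proj_var(v1) >= max(proj_var(v) for v in directions))   # True
```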

To determine which eigenvector (and consequently, what the eigenvalue $\lambda$ means), we plug $v_{1}$ back into the expression for the sample variance. Using the fact that $X^{\intercal}Xv_{1}=\lambda v_{1}$ and $v_{1}^{\intercal}v_{1}=1$, $$ \operatorname{Var}(\mathbf{x}\cdot v_{1})=\frac{1}{N}v_{1}^{\intercal}X^{\intercal}Xv_{1}=\frac{\lambda}{N}v_{1}^{\intercal}v_{1}=\frac{\lambda}{N}. $$ Since $v_{1}$ maximizes the sample variance, it follows that $v_{1}$ is an eigenvector associated with the largest eigenvalue $\lambda_{1}$ of the matrix $X^{\intercal}X$ (all eigenvalues of $X^{\intercal}X$ are nonnegative since $X^{\intercal}X$ is positive semidefinite). Moreover, the above tells us that $\lambda_{1}/N$ is the variance explained by the first principal component.

In general, letting $\lambda_{k}$ denote the $k$-th largest eigenvalue of $X^{\intercal}X$, $$ \boxed{\frac{\lambda_{k}}{N}\text{ is the variance explained by the }k\text{-th principal component.}} $$ It's important to point out that, in practice, people usually talk about $\sigma_{k}=\sqrt{\lambda_{k}}$ instead of $\lambda_{k}$ directly. Due to the connection between PCA and SVD, $\sigma_k$ is called the $k$-th singular value of $X$.
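The boxed identity and the SVD connection are easy to check numerically; below is a small sketch (NumPy assumed, data synthetic and centered):

```python
import numpy as np

rng = np.random.default_rng(2)
N, p = 2000, 5
X = rng.normal(size=(N, p)) @ rng.normal(size=(p, p))   # synthetic data
X -= X.mean(axis=0)                                      # center the points

# Eigenpairs of X^T X, sorted by decreasing eigenvalue
eigvals, V = np.linalg.eigh(X.T @ X)
eigvals, V = eigvals[::-1], V[:, ::-1]

# lambda_k / N is the variance of the projection onto the k-th principal component
explained = np.var(X @ V, axis=0)                        # variance along each component
print(np.allclose(eigvals / N, explained))               # True

# sigma_k = sqrt(lambda_k) are the singular values of X
print(np.allclose(np.sqrt(eigvals), np.linalg.svd(X, compute_uv=False)))  # True
```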

ANSWER

This matrix based derivation might provide a useful perspective. Consider the following decomposition of the covariance matrix $\Sigma$:

$$ E\left[\mathbf{x}\mathbf{x}^T\right]=\Sigma=Q\Lambda Q^{T} $$

where $E$ is the expectation operator, $\mathbf{x}$ is the data vector (assumed to have zero mean, so that $E\left[\mathbf{x}\mathbf{x}^T\right]$ is indeed the covariance matrix), $Q$ is the matrix of orthonormal eigenvectors of $\Sigma$, and $\Lambda$ is the diagonal matrix of the corresponding eigenvalues.

Note for later that $Q^{T}\Sigma Q=\Lambda$.

Now, the projections of the data onto the eigenvectors are given by $Q^{T}\mathbf{x}$, so we can ask: what is the covariance of these transformed data? We obtain it from

$$ \begin{aligned} E\left[Q^{T}\mathbf{x}\left(Q^{T}\mathbf{x}\right)^{T}\right] &= E\left[Q^{T}\mathbf{x}\mathbf{x}^{T}Q\right] \\ &= Q^{T}E\left[\mathbf{x}\mathbf{x}^{T}\right]Q \\ &= Q^{T}\Sigma Q=\Lambda. \end{aligned} $$

Therefore, the eigenvalues of the original covariance matrix (i.e. the entries of the diagonal matrix $\Lambda$) are the variances of the projected data along the eigenvectors. I hope this helps.
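A short numerical check of this conclusion (a sketch assuming NumPy, with a synthetic zero-mean sample standing in for the expectation):

```python
import numpy as np

rng = np.random.default_rng(3)
n, p = 5000, 4
X = rng.normal(size=(n, p)) @ rng.normal(size=(p, p))   # synthetic data, rows are x^T
X -= X.mean(axis=0)                                      # zero mean, so E[x x^T] is the covariance

Sigma = X.T @ X / n                                      # sample covariance matrix
eigvals, Q = np.linalg.eigh(Sigma)                       # Sigma = Q diag(eigvals) Q^T

# Covariance of the projected data Q^T x: should equal diag(eigvals)
Y = X @ Q                                                # each row is (Q^T x_i)^T
proj_cov = Y.T @ Y / n
print(np.allclose(proj_cov, np.diag(eigvals)))           # True
```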