In linear algebra, the eigenvectors of a matrix are the vectors whose direction doesn't change when the matrix is applied (as a transformation) to the space. But in machine learning (PCA, to be specific), the eigenvectors of a matrix are the directions of maximum variance of the data points. I can't connect the two ideas: how can I tell that the direction that doesn't change is the direction of maximum variance?
Can I say that eigenvectors have two properties: they keep the same direction after the transformation is applied, and they also describe the directions of maximum variance?
Thanks
PCA seeks the linear combination of your variables that has maximum variance, and the assertion is that the coefficients of that best combination constitute an eigenvector for the covariance matrix of the variables. (We assume the coefficients are normalized so that the sum of squares equals 1.)
You can see this in the two-dimensional case. Say you've observed a data set with two variables $x$ and $y$, and suppose the correlation between $x$ and $y$ is $\rho$. For simplicity, assume $x$ and $y$ each have variance $1$. Then the covariance matrix of $x,y$ is $$ \Sigma:=\begin{pmatrix}1&\rho\\\rho &1\end{pmatrix} $$ and the variance of the linear combination $ax+by$ is $$ \operatorname{var}(ax+by)=a^2\operatorname{var}x+b^2\operatorname{var}y + 2ab\operatorname{cov}(x,y)=a^2+b^2+2\rho ab.\tag1 $$

Suppose we want to maximize (1) over all $a,b$ subject to the constraint $a^2+b^2=1$. Using Lagrange multipliers, we form the objective function $$ L(a,b;\lambda):= a^2+b^2+2\rho ab-\lambda(a^2+b^2-1) $$ and take partials with respect to $a, b, \lambda$: $$ {\partial L\over\partial a}=2(a+\rho b -\lambda a)\\ {\partial L\over\partial b}=2(b+\rho a -\lambda b)\\ {\partial L\over\partial \lambda}=-(a^2+b^2-1) $$

The maximum occurs where these partials are all zero. Setting the first two to zero and rearranging into matrix form, we get $$ \begin{pmatrix}1-\lambda &\rho\\\rho&1-\lambda\end{pmatrix} \begin{pmatrix}a\\b\end{pmatrix}= \begin{pmatrix}0\\0\end{pmatrix}, $$ or $$ (\Sigma-\lambda I){\bf v}={\bf 0}\tag2 $$ where we write ${\bf v}$ for the column vector $(a,b)^T$. But (2) says precisely that $(a,b)^T$ is an eigenvector of the covariance matrix $\Sigma$, corresponding to eigenvalue $\lambda$.
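You can check this derivation numerically. A small sketch with numpy (the value $\rho=0.6$ is just an arbitrary example, not from the derivation above): it compares the top eigenvector of $\Sigma$ against a brute-force search for the unit vector $(a,b)=(\cos t,\sin t)$ maximizing formula (1).

```python
import numpy as np

rho = 0.6  # arbitrary example correlation (an assumption for illustration)
Sigma = np.array([[1.0, rho], [rho, 1.0]])

# Eigendecomposition of the covariance matrix (eigh: symmetric matrices,
# eigenvalues returned in ascending order)
eigvals, eigvecs = np.linalg.eigh(Sigma)
v_max = eigvecs[:, np.argmax(eigvals)]  # eigenvector for the largest eigenvalue

# Brute-force search: parametrize unit vectors as (a, b) = (cos t, sin t)
t = np.linspace(0.0, np.pi, 100_000)
a, b = np.cos(t), np.sin(t)
var = a**2 + b**2 + 2 * rho * a * b     # formula (1)
best = np.array([a[np.argmax(var)], b[np.argmax(var)]])

print("eigenvector:", v_max, "eigenvalue:", eigvals.max())
print("brute-force maximizer:", best, "max variance:", var.max())
```

For this $\Sigma$ the eigenvalues are $1\pm\rho$, and both methods land on the direction $(1,1)/\sqrt2$ (possibly up to a sign flip, since eigenvectors are only determined up to sign).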
Note that the covariance matrix has two eigenvalues. The corresponding eigenvectors represent the linear combinations with maximum and minimum variance, and the eigenvalue itself is that variance: if $\Sigma{\bf v}=\lambda{\bf v}$ with $a^2+b^2=1$, then $\operatorname{var}(ax+by)={\bf v}^T\Sigma{\bf v}=\lambda$.
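The same picture holds for actual data, not just the idealized $\Sigma$. A minimal sketch (sample size, seed, and the true covariance are illustrative assumptions): draw correlated 2-D samples, take the sample covariance matrix, and verify that the variance of the data projected onto each eigenvector equals the corresponding eigenvalue.

```python
import numpy as np

rng = np.random.default_rng(0)
# Correlated 2-D data; the true covariance [[1, .6], [.6, 1]] is an example choice
X = rng.multivariate_normal([0.0, 0.0], [[1.0, 0.6], [0.6, 1.0]], size=50_000)

Sigma = np.cov(X, rowvar=False)          # sample covariance matrix (ddof=1)
eigvals, eigvecs = np.linalg.eigh(Sigma)  # eigenvalues in ascending order

for lam, v in zip(eigvals, eigvecs.T):
    proj_var = np.var(X @ v, ddof=1)      # variance of data projected onto v
    print(lam, proj_var)                  # eigenvalue equals projected variance
```

The smallest-eigenvalue direction gives the minimum-variance combination and the largest gives the maximum, exactly as in the two-variable derivation.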