Negative values in eigendecomposition when using NumPy


I am trying to verify my solution for a simple problem using numpy.

So we are given a data matrix, $\mathbf{X}$, where each row is a datapoint. We are also given its SVD, $\mathbf{U\Delta V}^T$. We are asked to compute the eigendecomposition of the covariance matrix $\mathbf{\Sigma}=\frac{1}{N}\mathbf{X}^T\mathbf{X}$, with $N$ the number of datapoints.

What I've done is:

$$ \mathbf{\Sigma} = \frac{1}{N}\mathbf{X}^T\mathbf{X} = \frac{1}{N}\mathbf{V\Delta U}^T\mathbf{U\Delta V}^T = \frac{1}{N}\mathbf{V\Delta}^2\mathbf{V}^T \implies \mathbf{\Sigma V} = \mathbf{V}\frac{\mathbf{\Delta^2}}{N} $$

So, eigenvectors are the columns of $\mathbf{V}$ while eigenvalues are the elements in the diagonal of $\frac{\mathbf{\Delta^2}}{N}$.

The problem arises when I try to verify this in NumPy by running this code:

import numpy as np

X = np.random.uniform(1, 20, (10, 10))
U, D, Vt = np.linalg.svd(X)
eigenvalues, eigenvectors = np.linalg.eig(1/10 * X.T @ X)

and then comparing 1/10 * D**2 with eigenvalues and Vt.T with eigenvectors: some (not all!) of the columns of $\mathbf{V}$ and of the computed eigenvectors differ in sign (but have the same absolute values). Note that the eigenvalues do have the right sign.
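The comparison above can be made explicit. A minimal sketch (the seeded random matrix and the per-column sign check are illustrative assumptions, not part of the original code): since np.linalg.eig does not guarantee any ordering, both spectra are first sorted into descending order, and then each eigenvector column is compared to the corresponding column of $\mathbf{V}$ up to a sign flip.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(1, 20, (10, 10))

U, D, Vt = np.linalg.svd(X)
eigenvalues, eigenvectors = np.linalg.eig(X.T @ X / 10)

# eig returns eigenvalues in no guaranteed order; sort both descending
# (SVD already returns singular values in descending order)
order = np.argsort(eigenvalues)[::-1]
eigenvalues = eigenvalues[order].real
eigenvectors = eigenvectors[:, order].real

# The eigenvalues agree directly with the squared singular values / N
assert np.allclose(D**2 / 10, eigenvalues)

# The eigenvectors agree only up to a per-column sign: compute each
# column's dot product with the matching column of V and flip accordingly
signs = np.sign(np.sum(Vt.T * eigenvectors, axis=0))
assert np.allclose(Vt.T * signs, eigenvectors)
```

Both assertions pass, which shows the hand-computation is correct and only the sign of individual eigenvectors is ambiguous.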

Is there an error in my hand computation, is this some approximation error, or is this a property of eigenbases I am not aware of?

Best answer:

If $v$ is an eigenvector of some operator with the corresponding eigenvalue $\lambda$, then for any $k \neq 0$ the vector $kv$ is also an eigenvector with the same eigenvalue $\lambda$. To put it another way, the eigenvectors corresponding to $\lambda$ (together with the zero vector) form a linear subspace, a point of view that is especially important when considering degenerate eigenvalues (those with geometric multiplicity greater than 1).

Thus, even if you normalize the eigenvectors, both $v$ and $-v$ are still valid answers for the same $\lambda$. Naturally, you don't have much control over which of the two normalized eigenvectors a numerical method will return; but you can always multiply by $-1$ if you need to.
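One practical way to apply that last remark is to fix a sign convention and apply it to both bases. The sketch below (the `fix_signs` helper and its "largest-magnitude entry positive" rule are illustrative choices, not the only possibility) uses np.linalg.eigh, which is the natural routine for the symmetric matrix $\frac{1}{N}\mathbf{X}^T\mathbf{X}$ and returns eigenvalues in ascending order:

```python
import numpy as np

def fix_signs(vecs):
    """Flip each column so its largest-magnitude entry is positive.

    This is one common convention for making eigenvector signs
    deterministic; any consistent rule works, since v and -v span
    the same one-dimensional eigenspace.
    """
    idx = np.argmax(np.abs(vecs), axis=0)
    signs = np.sign(vecs[idx, np.arange(vecs.shape[1])])
    return vecs * signs

rng = np.random.default_rng(0)
X = rng.uniform(1, 20, (10, 10))
_, _, Vt = np.linalg.svd(X)

# eigh returns real eigenvalues in ascending order; reverse the columns
# to match the SVD's descending singular-value order
_, eigenvectors = np.linalg.eigh(X.T @ X / 10)

V = fix_signs(Vt.T)
W = fix_signs(eigenvectors[:, ::-1])

# Under the same sign convention, the two bases coincide
assert np.allclose(V, W)
```

This assumes the eigenvalues are distinct (true almost surely for a random matrix), so each eigenspace is one-dimensional and the only ambiguity left after normalization is the sign.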