Informative proof that any real-valued symmetric matrix only has real eigenvalues

I am looking for an informative proof that any real-valued symmetric matrix only has real eigenvalues. By informative, I mean that there is an explanation accompanying the proof, rather than just a copy-and-paste job, which is not informative.

I came across this question, but (1) the top-rated answer by Lepidopterist is, by his own admission, not a proof of the result but rather an explanation of why the displayed method fails to prove it, and (2) none of the posted proofs offer explanations; they are just copy-and-paste jobs. Also, the author of that question accepted none of the answers, so it seems that they, too, found the answers unsatisfactory.

I'm seeking a proof and an accompanying explanation, so that I can properly learn the reasoning behind how any real-valued symmetric matrix only has real eigenvalues.

I would greatly appreciate it if someone could please take the time to clarify this.

There are 3 best solutions below

I found the proof in http://pi.math.cornell.edu/~jerison/math2940/real-eigenvalues.pdf to be informative and educational.

The Spectral Theorem states that if $A$ is an $n \times n$ symmetric matrix with real entries, then it has $n$ orthogonal eigenvectors. The first step of the proof is to show that all the roots of the characteristic polynomial of $A$ (i.e. the eigenvalues of $A$) are real numbers.

Recall that if $z = a + bi$ is a complex number, its complex conjugate is defined by $\bar{z} = a − bi$. We have $z \bar{z} = (a + bi)(a − bi) = a^2 + b^2$, so $z\bar{z}$ is always a nonnegative real number (and equals $0$ only when $z = 0$). It is also true that if $w$, $z$ are complex numbers, then $\overline{wz} = \bar{w}\bar{z}$.
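
These identities are easy to sanity-check directly in Python; a minimal sketch (the particular values of $z$ and $w$ are arbitrary):

```python
# Sanity-check the conjugate identities above; the values are arbitrary.
z = 3 + 4j
w = 1 - 2j

# z * conj(z) = a^2 + b^2 is a nonnegative real number.
assert z * z.conjugate() == 3**2 + 4**2

# Conjugation distributes over products: conj(wz) = conj(w) conj(z).
assert (w * z).conjugate() == w.conjugate() * z.conjugate()
```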

Let $\mathbf{v}$ be a vector whose entries are allowed to be complex. It is no longer true that $\mathbf{v} \cdot \mathbf{v} \ge 0$ with equality only when $\mathbf{v} = \mathbf{0}$. For example,

$$\begin{bmatrix} 1 \\ i \end{bmatrix} \cdot \begin{bmatrix} 1 \\ i \end{bmatrix} = 1 + i^2 = 0$$

However, if $\bar{\mathbf{v}}$ is the complex conjugate of $\mathbf{v}$, it is true that $\bar{\mathbf{v}} \cdot \mathbf{v} \ge 0$ with equality only when $\mathbf{v} = \mathbf{0}$. Indeed,

$$\begin{bmatrix} a_1 - b_1 i \\ a_2 - b_2 i \\ \vdots \\ a_n - b_n i \end{bmatrix} \cdot \begin{bmatrix} a_1 + b_1 i \\ a_2 + b_2 i \\ \vdots \\ a_n + b_n i \end{bmatrix} = (a_1^2 + b_1^2) + (a_2^2 + b_2^2) + \dots + (a_n^2 + b_n^2)$$

which is always nonnegative and equals zero only when all the entries $a_i$ and $b_i$ are zero.
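
Numerically, this is the distinction between `np.dot` and `np.vdot` (the latter conjugates its first argument); a quick sketch using the vector from the example above:

```python
import numpy as np

v = np.array([1, 1j])

# The naive dot product can vanish on a nonzero complex vector,
# exactly as in the example above: 1 + i^2 = 0.
assert np.dot(v, v) == 0

# Conjugating one factor restores a positive real "squared length":
# conj(v) . v = 1 + 1 = 2. NumPy's vdot conjugates its first argument.
assert np.vdot(v, v) == 2
```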

With this in mind, suppose that $\lambda$ is a (possibly complex) eigenvalue of the real symmetric matrix $A$. Thus there is a nonzero vector $\mathbf{v}$, also with possibly complex entries, such that $A\mathbf{v} = \lambda \mathbf{v}$. By taking the complex conjugate of both sides, and noting that $\bar{A} = A$ since $A$ has real entries, we get $\overline{A\mathbf{v}} = \overline{\lambda \mathbf{v}} \Rightarrow A \overline{\mathbf{v}} = \overline{\lambda} \overline{\mathbf{v}}$. Then, using that $A^T = A$,

$$\overline{\mathbf{v}}^T A \mathbf{v} = \overline{\mathbf{v}}^T(A \mathbf{v}) = \overline{\mathbf{v}}^T(\lambda \mathbf{v}) = \lambda(\overline{\mathbf{v}} \cdot \mathbf{v}),$$

$$\overline{\mathbf{v}}^T A \mathbf{v} = (A \overline{\mathbf{v}})^T \mathbf{v} = (\overline{\lambda} \overline{\mathbf{v}})^T \mathbf{v} = \overline{\lambda}(\overline{\mathbf{v}} \cdot \mathbf{v}).$$

Since $\mathbf{v} \not= \mathbf{0}$, we have $\overline{\mathbf{v}} \cdot \mathbf{v} \not= 0$. Thus $\lambda = \overline{\lambda}$, which means $\lambda \in \mathbb{R}$.

For further information on how the author gets from $\overline{\mathbf{v}}^T(\lambda \mathbf{v})$ to $\lambda(\overline{\mathbf{v}} \cdot \mathbf{v})$ and from $(\overline{\lambda} \overline{\mathbf{v}})^T \mathbf{v}$ to $\overline{\lambda}(\overline{\mathbf{v}} \cdot \mathbf{v})$, see this question.
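
None of this is part of the proof, but the conclusion is easy to check numerically: generate a random real symmetric matrix, confirm that even a general-purpose (non-symmetric) eigensolver returns real eigenvalues, and verify the identity $\lambda = \overline{\mathbf{v}}^T A \mathbf{v} / (\overline{\mathbf{v}} \cdot \mathbf{v})$ from the display above. A sketch in Python (the matrix size and seed are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)

# Build a random real symmetric matrix: (M + M^T)/2 is always symmetric.
M = rng.standard_normal((5, 5))
A = (M + M.T) / 2

# Even when computed by a general eigensolver that allows complex
# output, the eigenvalues of a real symmetric matrix come out real.
w, V = np.linalg.eig(A)
assert np.allclose(w.imag, 0)

# The two expressions for conj(v)^T A v in the proof agree:
# lambda = conj(v)^T A v / (conj(v) . v) for each eigenpair.
for lam, v in zip(w, V.T):
    assert np.isclose(np.conj(v) @ A @ v / np.vdot(v, v), lam)
```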

Let $A$ be a complex square matrix. In general, for any vectors $v$ and $w$, we know that $$\langle Av,w\rangle = \langle v,A^*w\rangle,$$ where $\langle v,w\rangle = \sum_i v_i \overline{w_i}$ is the standard inner product on $\mathbb{C}^n$ (linear in the first argument, conjugate-linear in the second) and $A^*$ is the conjugate transpose of $A$.

Now assume that $A$ is real and symmetric, and $v$ is an eigenvector of $A$ with eigenvalue $\lambda$. We then have: $$\lambda\langle v,v\rangle = \langle \lambda v,v\rangle = \langle Av,v\rangle = \langle v,A^*v\rangle = \langle v,Av\rangle = \langle v,\lambda v\rangle = \overline{\lambda}\langle v,v\rangle.$$ Therefore, $\lambda\langle v,v\rangle = \overline{\lambda}\langle v,v\rangle$. Since $v$ is an eigenvector of $A$, $v\neq \mathbf{0}$, hence $\langle v,v\rangle\neq 0$. Thus, we have that $\lambda = \overline{\lambda}$, proving that $\lambda$ must in fact be a real number.

This in general is why Hermitian operators (operators that are equal to their adjoints) have only real eigenvalues.
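
A small numerical illustration of this more general fact (the Hermitian matrix below is arbitrary, chosen only to equal its own conjugate transpose):

```python
import numpy as np

# A Hermitian matrix: equal to its own conjugate transpose.
H = np.array([[2, 1 - 1j],
              [1 + 1j, 3]])
assert np.allclose(H, H.conj().T)

# Its eigenvalues are real, as the inner-product argument predicts.
# For this H: trace 5, determinant 4, so the eigenvalues are 1 and 4.
eigvals = np.linalg.eigvals(H)
assert np.allclose(eigvals.imag, 0)
```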

$\newcommand{\metric}[2]{\langle #1,#2 \rangle}$$\newcommand{\R}{\mathbb{R}}$Here's another proof. The argument is an adjustment of the proof of the singular value decomposition of a matrix. I don't claim it's the shortest proof ever, but I think it is quite informative.

The geometric viewpoint. A symmetric $n\times n$-matrix $A$ gives rise to a bilinear form on $\mathbb{R}^n$: $$ \metric{v}{w} = w^T A v. $$ Associated with this form is the quadratic form $Q(v)=\metric{v}{v}=v^T A v$.

Consider this bilinear form as a generalized inner product on $\R^n$. In this sense, think of $\metric{v}{v}$ as the "squared length" of a vector $v\in\R^n$. If all eigenvalues of $A$ are positive, then $\metric{v}{v}$ is indeed always a positive number for $v \neq 0$; in this case $\metric{v}{w}$ is an inner product. If $A$ has negative or zero eigenvalues, this viewpoint is less intuitive. Nevertheless, such bilinear forms do occur naturally (e.g. the Minkowski metric on $\R^4$).

A natural question arises. How do the "distances" behave on $\R^n$ equipped with this bilinear form? For this, we investigate the map $$ f: S^{n-1}\subset\R^n\to \R: v \mapsto v^T A v. $$ We are going to proceed as follows. First we determine the vector $v_1 \in S^{n-1}$ such that $f(v_1)$ is maximal. Then we find a vector $v_2 \in S^{n-1} \cap \mathrm{span}\{v_1\}^\perp$ such that $f(v_2)$ is maximal (on the orthogonal complement of $v_1$). By repeating this argument we get an orthonormal basis $\{v_1,\ldots, v_n\}$ of eigenvectors with eigenvalues $f(v_1),\ldots, f(v_n)$. Since $f$ is real-valued, the eigenvalues are real.

Theorem. The matrix $A$ admits an orthonormal basis $\{v_1,\ldots, v_n\}$ of eigenvectors with real eigenvalues.

Proof. Basis step. Since $S^{n-1}$ is compact, there is a $v_1\in S^{n-1}$ such that $f(v_1)$ is maximal. Now take a $w \in S^{n-1}\cap \mathrm{span}\{v_1\}^\perp$. Consider the curve $$ \alpha\colon (-\epsilon,\epsilon)\to\R^n : t \mapsto \cos t\, v_1 + \sin t\, w. $$ Note that $\alpha(0)=v_1$ and $\alpha'(0)=w$. Now consider the composition $g(t)=f(\alpha(t))$. Its derivative is $$ \begin{align*} g'(t) &= (-\sin t\, v_1 + \cos t \, w)^T A (\cos t\, v_1 + \sin t\,w) \\ & \qquad + (\cos t\, v_1 + \sin t\,w)^T A (-\sin t\, v_1 + \cos t \, w) \\ &= 2 (-\sin t\, v_1 + \cos t \, w)^T A (\cos t\, v_1 + \sin t\,w) . \end{align*} $$ Here we used the symmetry of $A$. Note that $g$ is maximal at $t=0$ since $\alpha(0)=v_1$. Therefore $g'(0)=2 w^T A v_1$ must be zero. Since $w$ is an arbitrary unit vector perpendicular to $v_1$, the vector $Av_1$ is orthogonal to all of $\mathrm{span}\{v_1\}^\perp$ and must therefore be a multiple of $v_1$. We conclude that $v_1$ is an eigenvector with real eigenvalue $f(v_1) = v_1^T A v_1$.

Induction step. Suppose we have already found an orthonormal set $\{v_1,\ldots, v_k\}$ of eigenvectors with real eigenvalues. Then we apply the same argument as above to the function $f$ restricted to the subsphere $S^{n-1}\cap \mathrm{span}\{v_1,\ldots,v_k\}^\perp$. This yields a maximizer $v_{k+1}$ with $w^T A v_{k+1}=0$ for every unit $w$ perpendicular to $v_1,\ldots,v_{k+1}$; moreover, by symmetry, $v_i^T A v_{k+1} = (A v_i)^T v_{k+1} = f(v_i)\, v_i^T v_{k+1} = 0$ for $i \le k$, so $A v_{k+1}$ is again a multiple of $v_{k+1}$. $\square$
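
The construction in this proof can be mirrored numerically: repeatedly maximize $f(v)=v^T A v$ on the unit sphere within the orthogonal complement of the eigenvectors found so far. A rough Python sketch; the shift-and-power-iterate scheme, iteration count, and seed are arbitrary implementation choices, not part of the proof:

```python
import numpy as np

def eig_sym_variational(A, iters=1000, seed=0):
    """Build an orthonormal eigenbasis of the symmetric matrix A by
    repeatedly maximizing f(v) = v^T A v on the unit sphere within the
    orthogonal complement of the eigenvectors found so far (a numerical
    mirror of the proof above)."""
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    # Shift so the whole spectrum is positive; maximizing v^T A v on the
    # sphere then reduces to power iteration on B = A + cI.
    c = np.linalg.norm(A, 2) + 1.0
    B = A + c * np.eye(n)
    vecs = []
    for _ in range(n):
        v = rng.standard_normal(n)
        for _ in range(iters):
            v = B @ v
            # Deflate: project out the eigenvectors already found.
            for u in vecs:
                v -= (u @ v) * u
            v /= np.linalg.norm(v)
        vecs.append(v)
    V = np.column_stack(vecs)
    lams = np.array([v @ A @ v for v in vecs])  # f(v_k): real eigenvalues
    return lams, V

# Example: a random 4x4 symmetric matrix.
rng = np.random.default_rng(1)
M = rng.standard_normal((4, 4))
A = (M + M.T) / 2
lams, V = eig_sym_variational(A)
assert np.allclose(A @ V, V * lams, atol=1e-5)  # A v_k = f(v_k) v_k
```

The eigenvalues come out in decreasing order, since each round maximizes $f$ on a smaller subsphere, matching the order in which the proof produces them.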