Every real symmetric matrix has a non-zero real eigenvector.


Let $A$ be a symmetric $n×n$ real matrix. Then there exists a non-zero real eigenvector for $A$.

The proof (using calculus) in my reference is as follows:

Let $A$ be a real symmetric matrix, and let $f(X) = X^tAX$ be the associated quadratic form. Let $P$ be a point on the unit sphere such that $f(P)$ is a maximum for $f$ on the sphere. Then $P$ is an eigenvector for $A$. In other words, there exists a number $\lambda$ such that $AP = \lambda P$.

Here is a photo of the proof from Serge Lang's Introduction to Linear Algebra.

The thing is, I am not able to understand the above proof. What is the curve $C(t)$ in the proof? What does it look like? I am not able to grasp it, nor the concept of a quadratic form and of maximising it on the unit sphere.

Please could you explain the above proof in an easier way, or provide an alternative proof of the above theorem.

It would be better if you could provide a graphical view of the above proof.


There are 3 best solutions below

On BEST ANSWER

I had some trouble before I understood the printed proof, and I am posting this as an answer in the hope it will help you understand the printed proof. It is really more of a comment, but is far too long for that. I hope people will forgive me.

The quadratic form in 2-d is $ax^2+bxy+cy^2$, and you can get this from $X^TAX$ if $X$ is a 2-d column vector. The expression gets longer quite quickly in 3 or more dimensions. The value of the quadratic form varies across the unit sphere, and $P$ is the point where its value is maximum. The space $W$ is the tangent space (think of a plane touching a sphere as drawn) and its dimension is one less than $n$, the dimension of the containing space. $P$ is also used as the (unit) position vector of the point $P$, and with $w$ a unit vector in $W$, the curve $C(t)$ is defined. For each $t$, $C(t)$ is a point, and the points form a circle.
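As a concrete check of this picture (not from the book; the matrix below is a made-up example), here is a small numpy sketch: take $P$ a maximizer of the form on the unit sphere, $w$ a unit tangent vector at $P$, and verify that the circle $C(t)=\cos(t)\,P+\sin(t)\,w$ stays on the sphere and that $g(t)=f(C(t))$ has vanishing derivative at $t=0$:

```python
import numpy as np

# Hypothetical 3x3 symmetric matrix, chosen only for illustration.
A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])

def f(x):
    """The quadratic form X^T A X."""
    return x @ A @ x

# P = maximizer of f on the unit sphere (the top eigenvector, which is
# what the theorem asserts P must be), w = a unit vector tangent at P.
vals, vecs = np.linalg.eigh(A)
P = vecs[:, -1]
w = vecs[:, 0]                       # orthogonal to P, so tangent at P

def C(t):
    """The curve from the proof: a great circle through P in direction w."""
    return np.cos(t) * P + np.sin(t) * w

# C(t) lies on the unit sphere for every t, and C(0) = P.
ts = np.linspace(-1.0, 1.0, 201)
assert np.allclose([np.linalg.norm(C(t)) for t in ts], 1.0)
assert np.allclose(C(0.0), P)

# g(t) = f(C(t)) is maximal at t = 0, so g'(0) = 0 (central difference).
h = 1e-6
g_prime_0 = (f(C(h)) - f(C(-h))) / (2 * h)
assert abs(g_prime_0) < 1e-6         # derivative vanishes at the maximum
```

The point of the sketch is only to make the objects visible: $C(t)$ is an ordinary parametrized circle, and maximality of $f$ at $P$ forces the one-variable function $g$ to have a critical point at $t=0$.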

He proves the circle lies in the unit sphere. Unfortunately he says the direction of the curve "is perpendicular to the sphere at $P$", where I think he means perpendicular to the radius at $P$.

In the function $g(t) = f(C(t)) = C(t)\cdot AC(t)$, $C(t)$ is a column vector and $A$ is a matrix; the typeface does not make this clear.

$C\cdot AC' = C^TAC' = C^TA^TC' = (AC)^TC' = AC\cdot C' = C'\cdot AC$ expands the reasoning in the last line of the first page.
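That identity is easy to sanity-check numerically; a random symmetric matrix works (the specific matrix and vectors are just an illustration, not from the answer):

```python
import numpy as np

# For symmetric A, the identity used above says x . (A y) = y . (A x).
rng = np.random.default_rng(0)
M = rng.normal(size=(4, 4))
A = (M + M.T) / 2                  # a random symmetric matrix
x = rng.normal(size=4)
y = rng.normal(size=4)
assert np.isclose(x @ A @ y, y @ A @ x)
```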

On the second page, in $AP$, $P$ is again a column vector (of the point $P$) and $A$ the transformation matrix. Finally, the image $AP$ is perpendicular to the tangent space $W$, so it must be a multiple of the radius vector $P$.

On

Another approach: from the Spectral Theorem we have that $f \in \operatorname{End}(V)$ is diagonalizable with a basis that is orthonormal for $\phi$ and consists of eigenvectors for $f$ if and only if $f$ is self-adjoint.

As an obvious consequence, if $A$ is a symmetric matrix, $A \in \mathcal{S}(n,\mathbb{R})$, then $\exists P \in O(n) : P^{-1}AP = P^{t}AP = D$, where $D$ is diagonal.

Since $A$ is conjugate to a diagonal matrix by an orthogonal matrix, in particular the elements on the diagonal are the eigenvalues of $A$; and since we proved that such a basis exists, this translates into: $\exists v \ne 0 : Av = \lambda v$.
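Numerically, this spectral-theorem route can be sketched with numpy's `eigh`, which returns exactly such an orthogonal $P$ with $P^tAP = D$ (the random matrix here is just an illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
M = rng.normal(size=(4, 4))
A = (M + M.T) / 2                          # a symmetric real matrix

eigvals, P = np.linalg.eigh(A)             # A = P D P^T with P orthogonal
D = np.diag(eigvals)

assert np.allclose(P.T @ P, np.eye(4))     # P is orthogonal
assert np.allclose(P.T @ A @ P, D)         # P^{-1} A P = P^t A P = D

# Each column of P is a real non-zero eigenvector of A:
v, lam = P[:, 0], eigvals[0]
assert np.allclose(A @ v, lam * v)
```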


Edit :

In this case it is sufficient to prove the following:

Lemma: If $f = f^{*}$, then $\operatorname{sp}(f) \ne \varnothing$.

In other words, $\exists \, \lambda \in \mathbb{K}$ and $v \in V$, $v \ne 0$, such that $f(v) = \lambda v$.

Proof: Let us fix an orthonormal basis for $f = f^{*}$ (I can do this, for example, by picking the standard inner product). If I think of $f$, represented by the matrix $A$, as an endomorphism of $\mathbb{C}^{n}$, then certainly, since the characteristic polynomial factors completely over $\mathbb{C}$, $\exists \lambda \in \mathbb{C}, z \ne 0 : Az = \lambda z$. If we prove that $\lambda \in \mathbb{R}$, we are done.

The idea is to 'complexify' $\phi$.

We define the Hermitian product $\langle z,w \rangle_{\mathbb{C}} := z^{t} I \overline{w}$.

(where the bar denotes complex conjugation)

Notice that $\langle z,z \rangle_{\mathbb{C}} = z^{t}\overline{z} \in \mathbb{R}$. Now, on one hand:

$$\langle z, Az \rangle_{\mathbb{C}} = z^{t}\overline{Az} = z^{t}(\overline{\lambda z}) = \overline{\lambda}\,z^{t}\overline{z}$$

On the other hand, since $A^{t} = A$ and $A \in M(n,\mathbb{R})$ (so $\overline{Az} = A\overline{z}$):

$$\langle z, Az \rangle_{\mathbb{C}} = z^{t}A\overline{z} = z^{t}A^{t}\overline{z} = (Az)^{t}\overline{z} = (\lambda z)^{t}\overline{z} = \lambda z^{t}\overline{z}$$

Hence, comparing what we found, $\overline{\lambda}\,z^{t}\overline{z} = \lambda\, z^{t}\overline{z}$; since $z^{t}\overline{z} = \sum_i |z_i|^2 > 0$ for $z \ne 0$, it follows that $\overline{\lambda} = \lambda$, i.e. $\lambda \in \mathbb{R}$.

So the statement follows.
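The lemma's computation can be mirrored numerically (a sketch with a made-up random matrix, using the general complex eigensolver on purpose): the two expressions for $\langle z, Az\rangle_{\mathbb{C}}$ agree, forcing the eigenvalue to be real.

```python
import numpy as np

rng = np.random.default_rng(2)
M = rng.normal(size=(3, 3))
A = (M + M.T) / 2                  # real symmetric matrix

# Deliberately treat A as a complex matrix and take a complex eigenpair.
lams, Z = np.linalg.eig(A.astype(complex))
lam, z = lams[0], Z[:, 0]

lhs = z @ np.conj(A @ z)           # = conj(lambda) * z^t conj(z)
rhs = z @ (A @ np.conj(z))         # = lambda * z^t conj(z), by symmetry
assert np.isclose(lhs, rhs)
assert abs(lam.imag) < 1e-8        # hence lambda is real
```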

On

Usually (in my experience), this is instead proved using Lagrange multipliers. By the max/min (extreme value) theorem, the function $f(\mathbf{x}) = \mathbf{x}^\mathrm{T} A \mathbf{x}$ attains its maximum on the compact set of points $\mathbf{x}$ such that $g(\mathbf{x})=1$, where $g(\mathbf{x}) = \mathbf{x}^\mathrm{T} \mathbf{x}$, at some point $\mathbf{p}$. Hence at that maximum we have $\nabla f(\mathbf{p}) = \lambda \nabla g(\mathbf{p})$ for some scalar $\lambda \in \mathbb{R}$ (the Lagrange multiplier). Since $\nabla g(\mathbf{x}) = 2\mathbf{x}$ and $\nabla f(\mathbf{x}) = 2A\mathbf{x}$, that equality asserts precisely that $\mathbf{p}$ is an eigenvector of $A$ (and $\lambda$ is the corresponding eigenvalue)!
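This maximization picture can be acted out numerically (a sketch, with a hypothetical matrix): maximize $f$ on the unit sphere by projected gradient ascent, i.e. step along $\nabla f(\mathbf{x}) = 2A\mathbf{x}$ and renormalize, then check that the limit $\mathbf{p}$ satisfies $A\mathbf{p} = \lambda\mathbf{p}$:

```python
import numpy as np

A = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])    # hypothetical symmetric matrix

rng = np.random.default_rng(3)
p = rng.normal(size=3)
p /= np.linalg.norm(p)             # start somewhere on the unit sphere

for _ in range(500):
    p = p + 0.1 * (2 * A @ p)      # ascend along grad f = 2 A x
    p /= np.linalg.norm(p)         # project back onto the sphere g(x) = 1

lam = p @ A @ p                    # the multiplier = the eigenvalue
assert np.allclose(A @ p, lam * p, atol=1e-8)
```

Each iteration multiplies by $I + 0.2A$ and renormalizes, so this is power iteration in disguise; it converges to the eigenvector of the largest eigenvalue, which is exactly the maximizer the argument above produces.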

If you inline a proof of the method of Lagrange multipliers (one that considers test curves through the critical point, rather than one which uses the Lagrangian function) into that argument, you may end up with something similar to the textbook argument, but even then it feels a bit convoluted. It could, however, be that the author wished to avoid multivariable calculus in the proof (doing all the calculus with one variable $t$, and treating the several-variables part as static linear algebra).