Problem with Spectral Theorem Proof


Claim: Let A $\in \mathbb{R}^{n \times n }$ be symmetric. Then there is an orthonormal basis of $\mathbb{R}^n$ consisting of eigenvectors of A.


Sketch of proof:

Induction on $n$. The claim is clear for $n = 1$. Let $\lambda$ be an eigenvalue of $A$ and $x$ an associated eigenvector. Extend to a basis $\{x, v_2, \cdots, v_n \}$ of $\mathbb{R}^n$.

Applying Gram-Schmidt, we obtain an orthonormal basis $\{b_1, b_2, \cdots, b_n \}$ of $\mathbb{R}^n$ with $b_1 = x/\|x\|$.

We claim that V = span$\{b_2, b_3, \cdots, b_n \}$ is an invariant subspace under A. (Proof omitted)

Then we define $T \colon V \to V$ by $Tv = Av$.

Then we take $B \in \mathbb{R}^{(n-1)\times(n-1)}$ to be the matrix representation of $T$ with respect to $\{b_2, \cdots, b_n \}$ and show that $B$ is symmetric.

By induction hypothesis, we have an orthonormal basis of $\mathbb{R}^{n-1}$ consisting of eigenvectors of B. Let $\{ u_2, u_3, \cdots, u_n\}$ denote this basis.

We take $w_j = \sum_{k=2}^n u_{jk}b_k$ for $j = 2, 3, \ldots, n$, where $u_j = (u_{j2}, u_{j3}, \cdots, u_{jn} ) \in \mathbb{R}^{n-1}$.

Then $\{b_1, w_2, \cdots, w_n\}$ is an orthonormal basis for $\mathbb{R}^n$ consisting of eigenvectors of A.
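The whole inductive step can be checked numerically. Here is a minimal NumPy sketch for a random symmetric $A$ with $n = 4$ (the variable names are mine, not from the proof):

```python
import numpy as np

rng = np.random.default_rng(0)

# Minimal numeric sketch of the inductive step for n = 4.
n = 4
M = rng.standard_normal((n, n))
A = (M + M.T) / 2                        # symmetric A

# One eigenpair (lam, x) of A; eigh returns unit eigenvectors.
evals, evecs = np.linalg.eigh(A)
lam, x = evals[0], evecs[:, 0]

# Extend x to a basis and orthonormalize (QR performs Gram-Schmidt);
# the first column of Q is +-x, playing the role of b_1.
ext = np.column_stack([x, rng.standard_normal((n, n - 1))])
Q, _ = np.linalg.qr(ext)                 # columns b_1, ..., b_n

# B = matrix of T = A restricted to V = span{b_2, ..., b_n}.
B = Q[:, 1:].T @ A @ Q[:, 1:]
assert np.allclose(B, B.T)               # B is symmetric

# Orthonormal eigenbasis u_2, ..., u_n of B, lifted to w_j = sum_k u_{jk} b_k.
mus, U = np.linalg.eigh(B)
W = Q[:, 1:] @ U                         # columns w_2, ..., w_n

# {b_1, w_2, ..., w_n} is an orthonormal eigenbasis of A.
P = np.column_stack([Q[:, 0], W])
assert np.allclose(P.T @ P, np.eye(n))
assert np.allclose(A @ P, P @ np.diag(np.concatenate([[lam], mus])))
```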


The parts I am having trouble with are:

  • The need to show that $V$ is invariant under $A$ (I suspect it's so that we can represent $T$ the way we did, but I'd like clarification)
  • (My main problem:) Why our selection of $w_j$ satisfies our claim.

Any help would be much appreciated, I've been going a bit crazy going through the details. Thanks!


On BEST ANSWER

I think this is a very good example where the slightly more abstract proof using linear maps (which you use in any case) instead of matrices simplifies the argument and shows more clearly what is going on.

Let $(W, \left< \cdot, \cdot \right>)$ be a finite dimensional real inner product space and let $T \colon W \rightarrow W$ be a symmetric operator (so $T = T^{*}$). We want to show that $T$ has an orthonormal basis of eigenvectors by induction on $\dim W$. We have two steps:

  1. First, we must show that any symmetric operator has at least one eigenvalue. This is already non-trivial and requires some argument passing through the complex numbers or alternatively a topological argument.
  2. Then we perform the induction. The case $\dim W = 1$ is clear. When $\dim W > 1$, we take some unit-length eigenvector $v_1 \in W$ and consider $V = \{ v_1 \}^{\perp}$. We show that $V$ is $T$-invariant (that is, $T(V) \subseteq V$), which implies that we can restrict $T$ to $V$ and obtain an operator (a linear map from a vector space to itself, not to a different vector space). The operator $T|_{V} \colon (V, \left< \cdot, \cdot \right>|_{V \times V}) \rightarrow (V, \left< \cdot, \cdot \right>|_{V \times V})$ is also self-adjoint and $\dim V = \dim W - 1$, so by the induction hypothesis there is an orthonormal basis $v_2, \dots, v_n$ of eigenvectors of $T|_V$. The basis $(v_1, \ldots, v_n)$ is then an orthonormal basis of eigenvectors of $T$, and you don't need to take any linear combination of the $v_i$.
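The two facts used in step 2, that $\{v_1\}^{\perp}$ is $T$-invariant and that the restriction is again self-adjoint, can be sanity-checked numerically. A small NumPy sketch (all names are mine):

```python
import numpy as np

rng = np.random.default_rng(1)

# Sanity check: V = {v1}^perp is invariant under a symmetric A,
# and the restriction of A to V is again symmetric (n = 5).
n = 5
M = rng.standard_normal((n, n))
A = (M + M.T) / 2

v1 = np.linalg.eigh(A)[1][:, 0]          # unit eigenvector of A

# For v in V: <A v, v1> = <v, A v1> = lambda <v, v1> = 0, so A v stays in V.
v = rng.standard_normal(n)
v -= (v @ v1) * v1                       # project v onto V
assert abs((A @ v) @ v1) < 1e-10

# Orthonormal basis of V via QR, then the matrix of the restriction.
Q = np.linalg.qr(np.column_stack([v1, rng.standard_normal((n, n - 1))]))[0]
B = Q[:, 1:].T @ A @ Q[:, 1:]            # matrix of T restricted to V
assert np.allclose(B, B.T)               # restriction is self-adjoint
```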

Regarding your questions,

  1. You need to show that $V$ is $T$-invariant (or $A$-invariant in your case) in order to be able to restrict $T$ to $V$ and obtain an operator. If you don't show that $V$ is $A$-invariant, you can still define $T \colon V \rightarrow \mathbb{R}^n$, but then $T$ won't be represented by a square $(n-1)\times(n-1)$ matrix but by an $n \times (n-1)$ matrix, and you won't be able to use the induction hypothesis.

  2. Denote by $\mathcal{B} = (b_1, \ldots, b_n)$ the orthonormal basis constructed above and by $\mathcal{C} = (e_1, \ldots, e_n)$ the standard basis. Consider the operator $T \colon \mathbb{R}^n \rightarrow \mathbb{R}^n$ defined by $Tv = Av$. The matrix $A$ represents $T$ with respect to $\mathcal{C}$, while the matrix $B$ represents the action of the restriction $T|_V$ with respect to the different basis $\hat{\mathcal{B}} = (b_2, \dots, b_n)$.

    An eigenvector $u_i$ of $B$ (an element of $\mathbb{R}^{n-1}$) can't literally be an eigenvector of $A$ (an element of $\mathbb{R}^n$), but it represents one in the sense that $u_i$ consists of the coefficients of an eigenvector $w_i$ of $A$ with respect to the basis $\hat{\mathcal{B}}$ (the expansion of $w_i$ has no $b_1$ component since $w_i \in \{ b_1 \}^{\perp}$).


Addition: Let $\mathbf{u} = (u_2, \ldots, u_n) \in \mathbb{R}^{n-1}$ be an eigenvector of $B$ with $B\mathbf{u} = \lambda \mathbf{u}$ and set $\mathbf{w} = \sum_{i=2}^n u_i \mathbf{b}_i$. We want to show that $A\mathbf{w} = \lambda \mathbf{w}$. Since $(\mathbf{b}_1, \ldots, \mathbf{b}_n)$ is an orthonormal basis of $\mathbb{R}^n$, it is enough to show that

$$ \left< A \mathbf{w}, \mathbf{b}_j \right> = \left< \lambda \mathbf{w}, \mathbf{b}_j \right> = \begin{cases} \lambda u_j & 2 \leq j \leq n \\ 0 & j = 1 \end{cases}.$$

Since $V$ is $A$-invariant and $\mathbf{w} \in V$ we have $\left<A\mathbf{w}, \mathbf{b}_1 \right> = 0$. For $2 \leq j \leq n$ we have

$$ \left< A \mathbf{w}, \mathbf{b}_j \right> = \left< \sum_{i=2}^n u_i A\mathbf{b}_i, \mathbf{b}_j \right> = \sum_{i=2}^n \left< A \mathbf{b}_i, \mathbf{b}_j \right> u_i = \sum_{i=2}^n b_{ij} u_i = \sum_{i=2}^n b_{ji} u_i = (B \mathbf{u})_j = (\lambda \mathbf{u})_j = \lambda u_j $$

where we used the fact that $b_{ij} = \left< A\mathbf{b}_i, \mathbf{b}_j \right>$ and that $B$ is symmetric.
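The displayed computation can also be verified numerically for a random symmetric matrix. A small NumPy sketch (variable names are mine):

```python
import numpy as np

rng = np.random.default_rng(3)

# Numeric check of <A w, b_1> = 0 and <A w, b_j> = lambda * u_j (n = 4).
n = 4
M = rng.standard_normal((n, n))
A = (M + M.T) / 2
x = np.linalg.eigh(A)[1][:, 0]           # unit eigenvector of A, plays b_1

Q = np.linalg.qr(np.column_stack([x, rng.standard_normal((n, n - 1))]))[0]
B = Q[:, 1:].T @ A @ Q[:, 1:]            # entries b_{ij} = <A b_i, b_j>

mus, U = np.linalg.eigh(B)
lam, u = mus[0], U[:, 0]                 # B u = lam u
w = Q[:, 1:] @ u                         # w = sum_i u_i b_i

assert abs((A @ w) @ Q[:, 0]) < 1e-10    # <A w, b_1> = 0
assert np.allclose(Q[:, 1:].T @ (A @ w), lam * u)   # <A w, b_j> = lam u_j
```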

Another answer:

$V$ needs to be an invariant subspace under $A$ so that $T$ goes from $V$ to $V$. Otherwise, if $V$ were not invariant, $T$ would go from $V$ to $\Bbb{R}^n$ and then it would not have a matrix representation in $\Bbb{R}^{(n-1)\times(n-1)}$.


Here's a proof that $w_j$ is an eigenvector of $A$:

$$Bu_j=\lambda_ju_j$$

This is because $u_j$ is an eigenvector of $B$. Now, let $f$ be the map from $V$ to $\Bbb{R}^{n-1}$ that sends a vector in $V$ to its coordinate vector in $\Bbb{R}^{n-1}$ with respect to the basis $\{b_2, b_3, ..., b_n\}$. (Look at Definition VR on this page.) Apply $f^{-1}$ to both sides.

$$f^{-1}(Bu_j)=f^{-1}(\lambda_ju_j)$$

Let $w_j=f^{-1}(u_j)$ so that $u_j=f(w_j)$. On the right-hand side, we can bring $\lambda_j$ out because $f^{-1}$ is a linear transformation, so we have $\lambda_jf^{-1}(u_j)=\lambda_jw_j$. On the left-hand side, substitute $u_j=f(w_j)$.

$$f^{-1}(Bf(w_j))=\lambda_jw_j$$

Now, look at Theorem FTMR on this page. In our case, $T$ goes from $V$ to $V$, so as per the theorem's variable names, $B=C=\{b_2, b_3, ..., b_n\}$ and $p_B=p_C=f$. Thus, the left-hand side is simply $T(w_j)$ which is the same as $Aw_j$:

$$Aw_j=\lambda_jw_j$$

We have proven that $w_j$ is an eigenvector, but we didn't use the same definition of $w_j$ as your book, so we need to check that the two definitions agree. Go back to Definition VR from above. According to it, and since (in the definition's variable names) $f=p_B$ and $B=\{b_2, b_3, ..., b_n\}$, we have:

$$w_j=\sum_{k=1}^{n-1} f(w_j)_kb_{k+1}$$

Now, what's $f(w_j)$? It's $u_j$, as we said above:

$$w_j=\sum_{k=1}^{n-1} u_{jk} b_{k+1}$$

Now, redefine $u_j=(u_{j2}, u_{j3}, ..., u_{jn})$ as you did and adjust the summation:

$$w_j=\sum_{k=2}^{n} u_{jk}b_k$$

Thus, the definitions are indeed equivalent.
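The vector-representation map $f$ and its inverse can be written out concretely. Here is a minimal NumPy sketch of the lift $w_j = f^{-1}(u_j)$ (the names `f`, `f_inv`, `Bhat` are mine, chosen for illustration):

```python
import numpy as np

rng = np.random.default_rng(2)

# Sketch of f (coordinates w.r.t. {b_2, ..., b_n}) and the lift f^{-1}.
n = 4
M = rng.standard_normal((n, n))
A = (M + M.T) / 2
x = np.linalg.eigh(A)[1][:, 0]           # unit eigenvector of A

Q = np.linalg.qr(np.column_stack([x, rng.standard_normal((n, n - 1))]))[0]
Bhat = Q[:, 1:]                          # columns b_2, ..., b_n span V

f = lambda w: Bhat.T @ w                 # f: V -> R^{n-1}, coordinate vector
f_inv = lambda u: Bhat @ u               # f^{-1}: coordinates -> vector in V

B = Bhat.T @ A @ Bhat                    # matrix of T w.r.t. {b_2, ..., b_n}
mus, U = np.linalg.eigh(B)
u_j, lam_j = U[:, 0], mus[0]             # B u_j = lam_j u_j

w_j = f_inv(u_j)                         # w_j = sum_k u_{jk} b_k
assert np.allclose(f(w_j), u_j)          # f(w_j) = u_j
assert np.allclose(A @ w_j, lam_j * w_j) # A w_j = lam_j w_j
```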