Claim: Let $A \in \mathbb{R}^{n \times n}$ be symmetric. Then there is an orthonormal basis of $\mathbb{R}^n$ consisting of eigenvectors of $A$.
Sketch of proof:
Induction on $n$. The claim is clear for $n = 1$. Let $\lambda$ be an eigenvalue of $A$ and $x$ an associated eigenvector, and let $\{x, v_2, \dots, v_n \}$ be a basis for $\mathbb{R}^n$.
Applying Gram–Schmidt (with $x$ first), we get an orthonormal basis $\{b_1, b_2, \dots, b_n \}$ of $\mathbb{R}^n$ with $b_1 = x/\|x\|$.
We claim that $V = \operatorname{span}\{b_2, b_3, \dots, b_n \}$ is invariant under $A$. (Proof omitted.)
Then we define $T \colon V \to V$ by $Tv = Av$.
Then we take $B \in \mathbb{R}^{(n-1)\times(n-1)}$ to be the matrix representation of $T$ with respect to $\{b_2, \dots, b_n \}$ and show that $B$ is symmetric.
By the induction hypothesis, we have an orthonormal basis of $\mathbb{R}^{n-1}$ consisting of eigenvectors of $B$. Let $\{ u_2, u_3, \dots, u_n\}$ denote this basis.
We take $w_j = \sum_{k=2}^n u_{jk}b_k$ for $j = 2, 3, \dots, n$, where $u_j = (u_{j2}, u_{j3}, \dots, u_{jn} ) \in \mathbb{R}^{n-1}$.
Then $\{b_1, w_2, \cdots, w_n\}$ is an orthonormal basis for $\mathbb{R}^n$ consisting of eigenvectors of A.
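To convince myself the lifting step does what the sketch claims, I tried a tiny numerical sanity check (a made-up $3 \times 3$ example of my own, plain Python only — the matrix, basis, and eigenvalue below are my assumptions, chosen so everything is hand-checkable, not anything from the notes):

```python
import math

# A made-up symmetric matrix whose first standard basis vector e1 is
# already an eigenvector (eigenvalue 2), so Gram-Schmidt is trivial here.
A = [[2.0, 0.0, 0.0],
     [0.0, 3.0, 1.0],
     [0.0, 1.0, 3.0]]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def matvec(M, v):
    return [dot(row, v) for row in M]

# Orthonormal basis {b1, b2, b3} with b1 the chosen eigenvector.
b1, b2, b3 = [1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]

# B = matrix of the restriction of A to V = span{b2, b3},
# with entries <A b_i, b_j> as in the sketch.
basis_V = [b2, b3]
B = [[dot(matvec(A, bi), bj) for bi in basis_V] for bj in basis_V]
assert B == [[3.0, 1.0], [1.0, 3.0]]  # symmetric, as claimed

# u is a unit eigenvector of B with eigenvalue 4 (verified below).
u = [1 / math.sqrt(2), 1 / math.sqrt(2)]
lam = 4.0
assert all(abs(x - lam * y) < 1e-12 for x, y in zip(matvec(B, u), u))

# Lift u back to R^3: w = u_1 * b2 + u_2 * b3  (the w_j construction).
w = [u[0] * b2[i] + u[1] * b3[i] for i in range(3)]
Aw = matvec(A, w)

# w is indeed an eigenvector of A with the same eigenvalue.
assert all(abs(x - lam * y) < 1e-12 for x, y in zip(Aw, w))
```

The check passes, so at least numerically the coefficients of an eigenvector of $B$ really do assemble into an eigenvector of $A$ lying in $V$.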
The parts I am having trouble with are:
- The need for us to show that V is invariant under A (I'm kind of guessing it's to do with us being able to represent T like we did, but would like clarification)
- (My main problem:) Why our selection of $w_j$ satisfies our claim.
Any help would be much appreciated, I've been going a bit crazy going through the details. Thanks!
I think this is a very good example where the slightly more abstract proof using linear maps (which you use in any case) instead of matrices simplifies the argument and shows more clearly what is going on.
Let $(W, \left< \cdot, \cdot \right>)$ be a finite-dimensional real inner product space and let $T \colon W \rightarrow W$ be a symmetric operator (so $T = T^{*}$). We want to show that $T$ has an orthonormal basis of eigenvectors by induction on $\dim W$, exactly as in your argument.
Regarding your questions:
You need to show that $V$ is $T$-invariant (or $A$-invariant in your case) in order to be able to restrict $T$ to $V$ and obtain an operator $V \to V$. If you don't show that $V$ is $A$-invariant, you can still define $T$ as $T \colon V \rightarrow \mathbb{R}^n$, but then $T$ won't be represented by a square $(n-1)\times(n-1)$ matrix but by an $n \times (n-1)$ matrix, and you won't be able to use the induction hypothesis.
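Incidentally, the invariance itself (the step marked "(Proof omitted)" in your sketch) is a one-line consequence of symmetry. Assuming Gram–Schmidt was run with $x$ first, so that $Ab_1 = \lambda b_1$ and $V = \operatorname{span}\{b_2, \dots, b_n\} = \{b_1\}^{\perp}$:

```latex
v \in V \implies
\left< Av, b_1 \right>
  = \left< v, Ab_1 \right>        % symmetry of A
  = \lambda \left< v, b_1 \right> % b_1 is an eigenvector
  = 0
\implies Av \in \{b_1\}^{\perp} = V .
```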
Denote by $\mathcal{B} = (b_1, \ldots, b_n)$ your orthonormal basis and by $\mathcal{C} = (e_1, \ldots, e_n)$ the standard basis. Consider the operator $T \colon \mathbb{R}^n \rightarrow \mathbb{R}^n$ defined by $Tv = Av$. The matrix $A$ represents the operator $T$ with respect to $\mathcal{C}$, while the matrix $B$ represents the action of the restriction $T|_V$ with respect to the different basis $\hat{\mathcal{B}} = (b_2, \dots, b_n)$.
An eigenvector $u_i$ of $B$ (an element of $\mathbb{R}^{n-1}$) can't literally be an eigenvector of $A$ (an element of $\mathbb{R}^n$), but it represents one, in the sense that $u_i$ consists of the coefficients of an eigenvector $w_i$ of $A$ with respect to the basis $\hat{\mathcal{B}}$ (the expansion of $w_i$ has no $b_1$ component since $w_i \in V = \{ b_1 \}^{\perp}$).
Addition: Let $\mathbf{u} = (u_2, \ldots, u_n) \in \mathbb{R}^{n-1}$ be an eigenvector of $B$ with $B\mathbf{u} = \lambda \mathbf{u}$ and set $\mathbf{w} = \sum_{i=2}^n u_i \mathbf{b}_i$. We want to show that $A\mathbf{w} = \lambda \mathbf{w}$. Since $(\mathbf{b}_1, \ldots, \mathbf{b}_n)$ is an orthonormal basis of $\mathbb{R}^n$, it is enough to show that
$$ \left< A \mathbf{w}, \mathbf{b}_j \right> = \left< \lambda \mathbf{w}, \mathbf{b}_j \right> = \begin{cases} \lambda u_j & 2 \leq j \leq n \\ 0 & j = 1 \end{cases}.$$
Since $V$ is $A$-invariant and $\mathbf{w} \in V$ we have $\left<A\mathbf{w}, \mathbf{b}_1 \right> = 0$. For $2 \leq j \leq n$ we have
$$ \left< A \mathbf{w}, \mathbf{b}_j \right> = \left< \sum_{i=2}^n u_i A\mathbf{b}_i, \mathbf{b}_j \right> = \sum_{i=2}^n \left< A \mathbf{b}_i, \mathbf{b}_j \right> u_i = \sum_{i=2}^n b_{ij} u_i = \sum_{i=2}^n b_{ji} u_i = (B \mathbf{u})_j = (\lambda \mathbf{u})_j = \lambda u_j $$
where we used the fact that $b_{ij} = \left< A\mathbf{b}_i, \mathbf{b}_j \right>$ and that $B$ is symmetric.
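That last fact, that the matrix with entries $b_{ij} = \left< A\mathbf{b}_i, \mathbf{b}_j \right>$ inherits the symmetry of $A$, follows from $\left< A\mathbf{b}_i, \mathbf{b}_j \right> = \left< \mathbf{b}_i, A\mathbf{b}_j \right> = \left< A\mathbf{b}_j, \mathbf{b}_i \right>$. If it helps, here is a throwaway numerical check in plain Python (randomly generated data of my own, not tied to anything above):

```python
import math
import random

random.seed(0)  # reproducible made-up data

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def matvec(M, v):
    return [dot(row, v) for row in M]

n = 4

# Random symmetric matrix A.
A = [[0.0] * n for _ in range(n)]
for i in range(n):
    for j in range(i, n):
        A[i][j] = A[j][i] = random.uniform(-1.0, 1.0)

# Gram-Schmidt a random (almost surely independent) family
# into an orthonormal basis (b_1, ..., b_n).
basis = []
for _ in range(n):
    v = [random.uniform(-1.0, 1.0) for _ in range(n)]
    for b in basis:
        c = dot(v, b)
        v = [x - c * y for x, y in zip(v, b)]
    norm = math.sqrt(dot(v, v))
    basis.append([x / norm for x in v])

# Entries b_ij = <A b_i, b_j>: symmetry of A forces this matrix
# to be symmetric too, up to floating-point noise.
Bmat = [[dot(matvec(A, bi), bj) for bj in basis] for bi in basis]
assert all(abs(Bmat[i][j] - Bmat[j][i]) < 1e-9
           for i in range(n) for j in range(n))
```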