I'm reading a treatment of generalized eigenvectors in a differential equations book. They want to show that the number of linearly independent generalized eigenvectors for an eigenvalue $\lambda_i$ equals the algebraic multiplicity of $\lambda_i$. This result is used to prove that there is a basis transformation putting the matrix in Jordan canonical form.
It goes as follows: let $p(x)=\prod_{i=1}^q(x-\lambda_i)^{m_i}$ be the characteristic polynomial of some $\mathbf{A}\in\mathbb{R}^{n\times n}$, thus $\sum_{i=1}^qm_i=n$, with $m_i$ the algebraic multiplicity of eigenvalue $\lambda_i$. Then, by Cayley-Hamilton, $$\mathbf{0}=\prod_{i=1}^q(\mathbf{A}-\lambda_i\mathbf{I})^{m_i},$$ and from here it is concluded that $S_i=\{\mathbf{v}:(\mathbf{A}-\lambda_i\mathbf{I})^{m_i}\mathbf{v}=\mathbf{0}\}$ is a vector subspace of dimension $m_i$.
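For concreteness, here is a quick numerical sanity check of that claim on a defective matrix (the matrix and helper functions are my own illustration, not from the book). It builds a $3\times 3$ matrix with eigenvalue $2$ of algebraic multiplicity $2$ but only one ordinary eigenvector, and verifies that $\dim S_i$ still equals $m_i$:

```python
from fractions import Fraction

def mat_mul(A, B):
    """Multiply two square matrices of Fractions."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_pow(A, p):
    """Raise a square matrix to a nonnegative integer power."""
    n = len(A)
    R = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    for _ in range(p):
        R = mat_mul(R, A)
    return R

def rank(M):
    """Rank via exact Gauss-Jordan elimination over the rationals."""
    M = [row[:] for row in M]
    n, r = len(M), 0
    for col in range(len(M[0])):
        piv = next((i for i in range(r, n) if M[i][col] != 0), None)
        if piv is None:
            continue
        M[r], M[piv] = M[piv], M[r]
        for i in range(n):
            if i != r and M[i][col] != 0:
                f = M[i][col] / M[r][col]
                M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        r += 1
    return r

# A has eigenvalue 2 with algebraic multiplicity 2 (but only a
# one-dimensional ordinary eigenspace) and a simple eigenvalue 3.
A = [[Fraction(x) for x in row] for row in [[2, 1, 0], [0, 2, 0], [0, 0, 3]]]
n, lam, m = 3, 2, 2  # dimension, eigenvalue, algebraic multiplicity

B = [[A[i][j] - (lam if i == j else 0) for j in range(n)] for i in range(n)]

print(n - rank(B))              # dim ker (A - 2I)   = 1: one eigenvector only
print(n - rank(mat_pow(B, m)))  # dim ker (A - 2I)^2 = 2 = m, as claimed
```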
I don't see where this last assertion comes from. I have seen a (completely different) proof of the Jordan normal form theorem before. I would prove Cayley-Hamilton from the Jordan normal form theorem, and because similar matrices have the same eigenvalues with the same multiplicities, the Jordan normal form of $\mathbf{A}$ makes it easy to see that $S_i$ is a subspace of dimension $m_i$.
Can anyone explain to me how this conclusion is reached without the Jordan normal form theorem?
Let $V$ be a finite-dimensional vector space over $\mathbb{C}$ with $\dim V= n $ and let $T \in L(V)$.
Lemma $1$. If $V = V_1 \oplus \cdots \oplus V_k$ where each $V_i$ is invariant under $T$, then $p_T(x) = \prod_{i=1}^k p_{T|_{V_i}}(x)$.
Proof.
Pick a basis $B$ for $V$ of the form $$B = (\text{basis for }V_1, \ldots, \text{basis for }V_k)$$ The matrix of $T$, and hence also of $xI - T$, w.r.t. $B$ is block-diagonal, so the claim follows by taking the determinant blockwise.
Lemma $2$. Let $\lambda \in \sigma(T)$, let $G(\lambda) = \ker(T - \lambda I)^n$ and $d = \dim G(\lambda)$. Then $p_{T|_{G(\lambda)}}(x) = (x - \lambda)^d$.
Proof.
The polynomial $(x-\lambda)^n$ annihilates $T|_{G(\lambda)}$, so $\emptyset \ne \sigma\left(T|_{G(\lambda)}\right) \subseteq \{\text{zeroes of } (x-\lambda)^n\} = \{\lambda\}$. Hence $p_{T|_{G(\lambda)}}$ is a monic polynomial of degree $d$ whose only root is $\lambda$, i.e. $p_{T|_{G(\lambda)}} = (x - \lambda)^{d}$.
Lemma $3$. For every $\lambda \in \mathbb{C}$ we have $V = \ker (T - \lambda I)^n \oplus \operatorname{Im}(T - \lambda I)^n$, and both summands are $T$-invariant.
Proof.
Recall that for every $v \in V$ there exists a unique monic polynomial $p_v \in \mathbb{C}[x]$ of minimal degree such that $p_v(T)v = 0$. Since the minimal polynomial $m_T$ annihilates $T$, we have $m_T(T)v = 0$, so $p_v \mid m_T$ and hence $\deg p_v \le \deg m_T \le n$.
Notice that $\ker (T - \lambda I)^n \cap \operatorname{Im}(T - \lambda I)^n = \{0\}$. Indeed, if $y \in \ker (T - \lambda I)^n \cap \operatorname{Im}(T - \lambda I)^n$ then there exists $x \in V$ such that $y = (T - \lambda I)^{n}x$. We have $(T - \lambda I)^ny = (T - \lambda I)^{2n}x = 0$, so $p_x \mid (x - \lambda)^{2n}$, i.e. $p_x = (x-\lambda)^k$ for some $k \le \deg p_x \le n$. Therefore $(T - \lambda I)^{n}x = 0$ already, that is, $y = 0$.
Rank-nullity theorem now implies $V = \ker (T - \lambda I)^n \oplus \operatorname{Im}(T - \lambda I)^n$.
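To make Lemma $3$ concrete, here is the decomposition on a small example of my own (not part of the proof). Take $n = 2$ and $\lambda = 2$:

```latex
\[
T = \begin{pmatrix} 2 & 1 \\ 0 & 3 \end{pmatrix}, \qquad
(T - 2I)^2 = \begin{pmatrix} 0 & 1 \\ 0 & 1 \end{pmatrix}^2
           = \begin{pmatrix} 0 & 1 \\ 0 & 1 \end{pmatrix},
\]
% kernel and image of (T - 2I)^2:
\[
\ker (T - 2I)^2 = \operatorname{span}\{e_1\}, \qquad
\operatorname{Im}(T - 2I)^2 = \operatorname{span}\{(1,1)^T\},
\]
% which intersect trivially and together span C^2, so
\[
\mathbb{C}^2 = \ker (T - 2I)^2 \oplus \operatorname{Im}(T - 2I)^2.
\]
```

Note also that $T(1,1)^T = (3,3)^T$, so the restriction of $T$ to the image has eigenvalue $3$, not $2$, matching the next step of the argument.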
Now we are ready to prove our result. Notice that $\lambda$ cannot be an eigenvalue of $T|_{\operatorname{Im}(T - \lambda I)^n}$. Indeed, suppose $y \in \operatorname{Im}(T - \lambda I)^n$ is an eigenvector for $\lambda$, and write $y = (T - \lambda I)^n x$ for some $x \in V$. Then $(T - \lambda I)^{n+1}x = (T - \lambda I)y = 0$, so $p_x \mid (x - \lambda)^{n+1}$, and as before $\deg p_x \le n$ forces $y = (T - \lambda I)^{n}x = 0$, a contradiction.
By Lemmas $1$, $2$ and $3$ we have $$p_T(x) = p_{T|_{G(\lambda)}}(x)\,p_{T|_{\operatorname{Im}(T - \lambda I)^n}}(x) = (x - \lambda)^d \, p_{T|_{\operatorname{Im}(T - \lambda I)^n}}(x)$$
Notice that the factors are relatively prime because $\lambda$ is not a root of $p_{T|_{\operatorname{Im}(T - \lambda I)^n}}$. Uniqueness of factorization in $\mathbb{C}[x]$ now implies $$d = \text{algebraic multiplicity of }\lambda$$
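To see the whole argument in action, here is a worked example of my own (any defective matrix works):

```latex
% T has eigenvalue 2 with algebraic multiplicity 2 and a simple eigenvalue 3.
\[
T = \begin{pmatrix} 2 & 1 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 3 \end{pmatrix},
\qquad
(T - 2I)^3 = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix},
\]
% so the generalized eigenspace and its dimension are
\[
G(2) = \ker(T - 2I)^3 = \operatorname{span}\{e_1, e_2\}, \qquad d = 2,
\]
% even though the ordinary eigenspace is only one-dimensional:
\[
\ker(T - 2I) = \operatorname{span}\{e_1\}.
\]
```

Indeed $p_T(x) = (x - 2)^2 (x - 3)$, and $d = 2$ matches the algebraic multiplicity of $\lambda = 2$, exactly as the factorization above predicts.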