My question pertains to the triangularization theorem: Chapter 6, Section 4, Theorem 5 in Hoffman & Kunze (HK), Linear Algebra. My question, in short, if you are already familiar with the text: when constructing a basis that triangulates the matrix, how do I know that the subspace spanned by the first two vectors is $T$-invariant? More details and the relevant citations from the text follow.
HK begin by proving the following lemma, which I understand totally:
Lemma. Let $V$ be a finite-dimensional vector space over the field $F$. Let $T$ be a linear operator on $V$ such that the minimal polynomial for $T$ is a product of linear factors $$p = (x - c_1)^{r_1} \cdots (x - c_k)^{r_k}, \ \ c_i \text{ in } F.$$ Let $W$ be a proper ($W \neq V$) subspace of $V$ which is invariant under $T$. There exists a vector $\alpha$ in $V$ such that $\alpha$ is not in $W$ and $(T - cI)\alpha$ is in $W$, for some characteristic value $c$ of the operator $T$.
Then they prove the following theorem (I've only reproduced the reverse direction).
Theorem 5. Let $V$ be a finite-dimensional vector space over the field $F$ and let $T$ be a linear operator on $V$. Then $T$ is triangulable if and only if the minimal polynomial for $T$ is a product of linear polynomials over $F$.
Proof. Suppose that the minimal polynomial factors $$p = (x - c_1)^{r_1} \cdots (x - c_k)^{r_k}.$$ By repeated application of the lemma above, we shall arrive at an ordered basis $\beta = \{\alpha_1, \dots, \alpha_n\}$ in which the matrix representing $T$ is upper-triangular: $$\tag{6-11} [T]_\beta = \begin{bmatrix} a_{11} & a_{12} & a_{13} & \cdots & a_{1n} \\ 0 & a_{22} & a_{23} & \cdots & a_{2n} \\ 0 & 0 & a_{33} & \cdots & a_{3n} \\ \vdots & \vdots & \vdots & & \vdots \\ 0 & 0 & 0 & \cdots & a_{nn}\end{bmatrix}.$$ Now, (6-11) merely says that $$\tag{6-12} T\alpha_j = a_{1j} \alpha_1 + \cdots + a_{jj} \alpha_j, \ \ 1 \leq j \leq n;$$ that is, $T \alpha_j$ is in the subspace spanned by $\alpha_1, \dots, \alpha_j$. To find $\alpha_1, \dots, \alpha_n$, we start by applying the lemma to the subspace $W = \{0\}$ to obtain the vector $\alpha_1$. Then apply the lemma to $W_1$, the space spanned by $\alpha_1$, to obtain $\alpha_2$. Next apply the lemma to $W_2$, the space spanned by $\alpha_1$ and $\alpha_2$. Continue in that way. One point deserves comment. After $\alpha_1, \dots, \alpha_i$ have been found, it is the triangular-type relations (6-12) for $j = 1, \dots, i$ which ensure that the subspace spanned by $\alpha_1, \dots, \alpha_i$ is invariant under $T$.
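To make the construction concrete, here is a small NumPy sketch (my own toy example, not from HK). I build an operator $T$ by conjugating an upper-triangular matrix by an invertible $P$, so the columns of $P$ play the role of the ordered basis $\alpha_1, \alpha_2, \alpha_3$, and then check that $[T]_\beta$ has the triangular form (6-11):

```python
import numpy as np

# Hypothetical 3x3 example: an upper-triangular matrix whose minimal
# polynomial is a product of linear factors (eigenvalues 2, 2, 3).
T_tri = np.array([[2.0, 1.0, 0.0],
                  [0.0, 2.0, 1.0],
                  [0.0, 0.0, 3.0]])

# Columns of P are the ordered basis alpha_1, alpha_2, alpha_3.
P = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0],
              [1.0, 0.0, 1.0]])

# The operator T written in the standard basis (no longer triangular).
T = P @ T_tri @ np.linalg.inv(P)

# Changing coordinates back to the basis beta recovers form (6-11).
T_beta = np.linalg.inv(P) @ T @ P
print(np.allclose(T_beta, np.triu(T_beta)))  # True: [T]_beta is upper triangular
```

Each column of $[T]_\beta$ then spells out the relation (6-12): $T\alpha_j$ is a combination of $\alpha_1, \dots, \alpha_j$ only.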
So, I understand that $\{0\}$ is a $T$-invariant subspace, so the lemma provides $\alpha_1$. Next, $\alpha_1$ is an eigenvector, so $W_1$ is (contained in) an eigenspace and is therefore $T$-invariant, so the lemma produces $\alpha_2$. But for the lemma to proceed to produce $\alpha_3$, we need to be sure that $W_2$ is $T$-invariant. So if $v \in W_2$, then $v = a_1 \alpha_1 + a_2 \alpha_2$ and $Tv = a_1 T \alpha_1 + a_2 T \alpha_2$. Obviously $T \alpha_1 \in W_2$ because $\alpha_1$ is an eigenvector of $T$. But how do we know that $T \alpha_2 \in W_2$?
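For what it's worth, the invariance does hold numerically in a toy example I tried (everything below is my own made-up illustration, not from HK): take $\alpha_1$ an eigenvector and $\alpha_2$ with $(T - cI)\alpha_2 \in W_1$, and check whether $T\alpha_2$ lands in $W_2 = \operatorname{span}\{\alpha_1, \alpha_2\}$:

```python
import numpy as np

# Toy operator: a Jordan block for eigenvalue 2 plus eigenvalue 3,
# conjugated so T is not already triangular in the standard basis.
J = np.array([[2.0, 1.0, 0.0],
              [0.0, 2.0, 0.0],
              [0.0, 0.0, 3.0]])
P = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0],
              [1.0, 0.0, 1.0]])
T = P @ J @ np.linalg.inv(P)

a1 = P[:, 0]   # eigenvector: T a1 = 2 a1
a2 = P[:, 1]   # (T - 2I) a2 = a1, which lies in W_1 = span{a1}

# Is T a2 in W_2 = span{a1, a2}?  Solve least squares, check the residual.
W2 = np.column_stack([a1, a2])
coeffs, *_ = np.linalg.lstsq(W2, T @ a2, rcond=None)
residual = np.linalg.norm(W2 @ coeffs - T @ a2)
print(residual < 1e-10)  # True: here T a2 = 1*a1 + 2*a2, inside W_2
```

Of course, a numerical check on one example isn't a proof, which is exactly why I'm asking how the general argument avoids circularity.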
HK obviously think this point is noteworthy, since their last sentence provides a comment that attempts to justify why $W_2$ is $T$-invariant. But their logic seems circular to me? (The basis produces the matrix...which gives the triangular relations...which justifies the use of the lemma...which produces the basis...which makes the matrix...which gives the triangular relations...etc.)