I was confused by the following argument, which appears not as a proof of anything but in one of the explanatory blurbs on page 193 of Hoffman-Kunze, in $\S 6.3$.
Let $T$ be a diagonalizable linear operator and let $c_1, \dots, c_k$ be the distinct characteristic values of $T$. Then it is easy to see that the minimal polynomial for $T$ is the polynomial $$p = (x-c_1)\cdots(x-c_k).$$ If $\alpha$ is a characteristic vector, then one of the operators $T - c_1I, \dots, T - c_k I$ sends $\alpha$ into $0$. **Therefore** $$(T - c_1 I) \cdots (T - c_k I)\,\alpha = 0 \hspace{3cm} (1)$$
It is the part in bold which does not make sense to me. I understand, obviously, why one of the operators $T - c_i I$ will send $\alpha$ into $0$--specifically, it will be the operator $T - c_i I$, where $c_i$ is the characteristic value associated with $\alpha$. But how does that make the expression in $(1)$ true, necessarily? The phrasing "one of the operators..." seems to be treating $(1)$ like a product of scalars, or something, but this is a product of linear operators--and a product of linear operators $UT$ is understood to be a function composition, that is, applying $T$ and then $U$. So I don't see how the reasoning that "one of the operators evaluates to $0$" works. It works if, coincidentally, $\alpha$ is associated with $c_k$, but otherwise it seems non-obvious why this is true. I'm sure it is true, but the lack of explanation suggests that I'm missing some understanding here.
It's very good that you're being cautious about the order of composition, since operators do not commute in general. However, $T$ commutes with itself and also with the identity operator, so we can treat "polynomials in $T$" quite casually, as if they were "ordinary" polynomials where everything commutes.
$$(T-c_1I)\cdots(T-c_kI)\,\alpha=\Big(\prod_{j\neq i}(T-c_jI)\Big)(T-c_iI)\,\alpha=\Big(\prod_{j\neq i}(T-c_jI)\Big)(0)=0,$$ where $c_i$ is the characteristic value (eigenvalue) associated with $\alpha$. We really can swap the order so that $(T-c_iI)$ is applied first. Formally, the claim is that if $f$ and $g$ are polynomials over the ground field, then $f(T)\circ g(T)=(f\cdot g)(T)=(g\cdot f)(T)=g(T)\circ f(T)$. Take a moment to convince yourself this is really true. Hint: it suffices to check nice simpler cases, such as the one where $f$ is a monomial.
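As a quick numerical sanity check (the matrix and vector below are made-up examples, not from Hoffman-Kunze), we can verify that the product of the factors $T - c_jI$ kills an eigenvector regardless of the order of the factors, and that the two orderings give the same operator:

```python
import numpy as np

# A diagonalizable matrix (hypothetical example) with distinct
# characteristic values c_1 = 2 and c_2 = 3.
T = np.array([[2., 0., 0.],
              [0., 3., 0.],
              [0., 0., 3.]])
I = np.eye(3)

# alpha is a characteristic vector for c_2 = 3, i.e. it is NOT
# annihilated by the last factor applied in the book's ordering.
alpha = np.array([0., 1., 0.])

A = (T - 2*I) @ (T - 3*I)   # factors in the book's order
B = (T - 3*I) @ (T - 2*I)   # factors swapped

print(np.allclose(A @ alpha, 0))  # True
print(np.allclose(B @ alpha, 0))  # True
print(np.allclose(A, B))          # True: polynomials in T commute
```

Both products annihilate $\alpha$ because they are literally the same matrix, namely $T^2 - 5T + 6I$.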
N.B. this is really touching on the fact that if $T:V\to V$ is a linear endomorphism of some vector space over a field $k$ then $V$ is naturally (and usefully!) viewed as a $k[t]$-module via the action $f(t)\cdot x:=f(T)(x)$.
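A minimal sketch of that module action, under the assumption that we represent $f$ by its coefficient list and use a small helper `act` (a hypothetical name, introduced here for illustration):

```python
import numpy as np

def act(coeffs, T, x):
    """Hypothetical k[t]-module action: f(t) . x := f(T)(x), where
    coeffs = [a_0, a_1, ..., a_n] encodes f(t) = a_0 + a_1 t + ... + a_n t^n.
    Evaluated by Horner's rule, so T is only ever applied to vectors."""
    result = np.zeros_like(x)
    for a in reversed(coeffs):
        result = T @ result + a * x
    return result

T = np.array([[2., 0.],
              [0., 3.]])
x = np.array([1., 1.])

# The minimal polynomial p(t) = (t-2)(t-3) = t^2 - 5t + 6 acts as zero
# on every vector:
print(act([6., -5., 1.], T, x))  # [0. 0.]
```

The point of Horner's rule here is that the action only ever needs matrix-vector products, which is exactly how one thinks of $V$ as a $k[t]$-module: polynomials act through repeated application of $T$.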