Confusing Argument about Minimal and Characteristic Polynomials in Hoffman-Kunze's Linear Algebra


I was confused by the following argument. It is not presented as a proof of anything, but appears in one of the explanatory blurbs on page 193 of Hoffman–Kunze, in $\S 6.3$.

Let $T$ be a diagonalizable linear operator and let $c_1, \dots, c_k$ be the distinct characteristic values of $T$. Then it is easy to see that the minimal polynomial for $T$ is the polynomial $$p = (x-c_1)\cdots(x-c_k).$$ **If $\alpha$ is a characteristic vector, then one of the operators $T - c_1I, \dots, T- c_k I$ sends $\alpha$ into $0$. Therefore** $$(T - c_1 I) \cdots (T-c_k I) \alpha = 0 \hspace{3cm} (1)$$

It is the part in bold which does not make sense to me. I understand, obviously, why one of the operators $T - c_i I$ will send $\alpha$ into $0$--specifically, it will be the operator $T - c_i I$ where $c_i$ is the characteristic value associated with $\alpha$. But why does that make the expression in $(1)$ true, necessarily? The phrasing "one of the operators..." seems to treat $(1)$ as if it were a product of scalars, but this is a product of linear operators--and a product of linear operators $UT$ is understood to be a function composition, that is, applying $T$ and then $U$. So I don't see how the reasoning "one of the operators evaluates to $0$" works. It works if, coincidentally, $\alpha$ is associated with $c_k$, but otherwise it seems non-obvious why this is true. I'm sure it is true, but the lack of explanation suggests I'm missing some understanding here.


There are 2 answers below.

BEST ANSWER

It's very good that you're being cautious about the order of composition, since operators do not commute in general. However, $T$ commutes with itself and also with the identity operator, so we can treat "polynomials in $T$" quite casually, as if they were "ordinary" polynomials where things commute.

$$(T-c_1I)\cdots(T-c_kI)\alpha=\Big(\prod_{j\neq i}(T-c_jI)\Big)(T-c_iI)\alpha=\Big(\prod_{j\neq i}(T-c_jI)\Big)(0)=0,$$ where $c_i$ is the characteristic value (eigenvalue) associated with $\alpha$. We really can swap the order. Formally, the claim is that if $f$ and $g$ are polynomials over the ground field then $f(T)\circ g(T)=(f\cdot g)(T)=(g\cdot f)(T)=g(T)\circ f(T)$. Take a moment to convince yourself this is really true. Hint: it suffices to check simpler cases, such as the one where $f$ is a monomial.
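As a sanity check, here is a small numerical sketch (my own hypothetical $2\times 2$ example, not from the book) of the two facts in play: polynomials in $T$ commute with one another, and the product annihilates $\alpha$ even though the factor that kills $\alpha$ is not the one applied first.

```python
import numpy as np

# A concrete diagonalizable operator (hypothetical example):
# T has distinct characteristic values c_1 = 2 and c_2 = 5.
T = np.array([[2.0, 0.0],
              [1.0, 5.0]])
I = np.eye(2)

A = T - 2.0 * I   # kills eigenvectors for c_1 = 2
B = T - 5.0 * I   # kills eigenvectors for c_2 = 5
C = T - 7.0 * I   # 7 is not an eigenvalue, so C kills no eigenvector

# Polynomials in T commute, even though matrix products generally do not:
assert np.allclose(A @ C, C @ A)

# alpha is an eigenvector for c_1 = 2:  T @ alpha = 2 * alpha.
alpha = np.array([3.0, -1.0])
assert np.allclose(T @ alpha, 2.0 * alpha)

# In (A @ B) @ alpha, the factor B is applied *first* and does not kill
# alpha -- yet the whole product still sends alpha to 0, because we may
# commute the factors so that A acts on alpha first:
print(B @ alpha)         # nonzero
print(A @ (B @ alpha))   # the zero vector
```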

N.B. this is really touching on the fact that if $T:V\to V$ is a linear endomorphism of some vector space over a field $k$ then $V$ is naturally (and usefully!) viewed as a $k[t]$-module via the action $f(t)\cdot x:=f(T)(x)$.

ANSWER

Apart from the fact that you can rearrange the order of the commuting operators so that the killing of the characteristic vector (eigenvector) $\alpha$ happens first (the remaining operators, being linear, cannot resuscitate the zero vector), there is an argument that does not require rearranging.

Since $\alpha$ is an eigenvector of $T$, every polynomial $P[T]$ in $T$ acts on $\alpha$ by a scalar multiplication, namely by the scalar $P[\lambda]$ (where $\lambda$ is the eigenvalue of $\alpha$ for $T$), and the result is then still in the eigenspace of $\alpha$ for $\lambda$. Then the composition of the (degree$~1$) polynomials $P_i$ in $T$ in the question acts as scalar multiplication by the product of the factors $P_i[\lambda]$; if (and only if) one of those factors is zero, the application results in the zero vector.

Or you can directly argue that the product $p$ of those polynomials acts by multiplication by $p[\lambda]$, which here is $0$ because $\lambda$ is equal to one of the roots $c_i$ of$~p$.
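The scalar-action claim can also be checked numerically; here is a minimal sketch with a hypothetical $2\times 2$ matrix (my own example, not from the text), showing that $p[T]$ acts on an eigenvector as multiplication by the scalar $p[\lambda]$.

```python
import numpy as np

# Hypothetical diagonalizable operator with eigenvalues 2 and 5:
T = np.array([[2.0, 0.0],
              [1.0, 5.0]])
I = np.eye(2)
alpha = np.array([3.0, -1.0])   # eigenvector of T with eigenvalue lam = 2
lam = 2.0

# p(x) = (x - 2)(x - 5), the product of the degree-1 factors.
p_of_T = (T - 2.0 * I) @ (T - 5.0 * I)   # p evaluated at T (a matrix)
p_of_lam = (lam - 2.0) * (lam - 5.0)     # p evaluated at lam (a scalar)

# p[T] acts on the eigenvector alpha as multiplication by p[lam]:
assert np.allclose(p_of_T @ alpha, p_of_lam * alpha)

# Here p[lam] = 0 because lam is one of the roots of p, so p[T] kills alpha:
assert np.allclose(p_of_T @ alpha, 0)
```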