Uniqueness of characteristic polynomial of linear transformation in finite fields


Let $T : V \to V$ be a linear transformation of the $F$-vector space $V$. Then using the (abstract) determinant function $\det : \operatorname{Hom}(V, V) \to F$ we can define a function $$ \lambda \mapsto \det(\lambda \cdot \operatorname{id} - T) $$ from $F$ to $F$. Now if we represent $T$ by a matrix $A$, we have $\det(\lambda \cdot \operatorname{id} - T) = \det(\lambda \cdot I - A)$, where on the RHS the determinant is an expression in the entries of the matrix. If we change the basis, i.e. represent $T$ by a different matrix $B = S^{-1}AS$, then a simple calculation shows that $\det(\lambda \cdot I - B) = \det(S^{-1}(\lambda I - A)S) = \det(\lambda \cdot I - A)$, and hence the value of the matrix expression does not depend on the basis.

But this is often used as a justification that the characteristic polynomial (i.e. the polynomial $\det(\lambda \cdot I - A)$) is independent of the chosen basis. But all I can derive from the above argument is that the values of the determinant are the same, i.e. if $p(x) := \det(x\cdot I - A)$ and $q(x) := \det(x \cdot I - B)$, then $q(x) = p(x)$ for all $x \in F$ whenever $B = S^{-1}AS$. If $F$ is infinite, this implies that the polynomials are equal (i.e. have the same sequence of coefficients, so the coefficients are also invariants of the transformation).

But the equivalence that for $p(x) = a_n x^n + \ldots + a_1 x + a_0$ and $q(x) = b_m x^m + \ldots + b_1 x + b_0$ we have $$ p(x) = q(x) \mbox{ for all } x \in F \quad \mbox{ iff } \quad m = n, ~ a_i = b_i, ~ i = 0,\ldots, n $$ need not hold over finite fields: for example, $p(x) = x^2 + x$ and $q(x) = 0$ define the same function on $\mathbb Z/2\mathbb Z$ but are different polynomials.
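The example above can be checked directly in a few lines of Python (a sketch; the coefficient-list encoding of polynomials is just for illustration): over $\mathbb Z/2\mathbb Z$, the polynomial $p(x) = x^2 + x$ vanishes at every field element, so as a *function* it agrees with $q(x) = 0$, even though the two have different coefficient sequences.

```python
# Over F_2 = Z/2Z, p(x) = x^2 + x vanishes at both field elements,
# so as a polynomial *function* it equals q(x) = 0 ...
F2 = [0, 1]  # the two elements of Z/2Z

def p(x):
    return (x * x + x) % 2  # p(x) = x^2 + x, computed in Z/2Z

assert all(p(x) == 0 for x in F2)

# ... but as *formal* polynomials (coefficient lists [a0, a1, a2])
# they are different:
p_coeffs = [0, 1, 1]  # x^2 + x
q_coeffs = [0, 0, 0]  # 0
assert p_coeffs != q_coeffs
```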

So then, is the characteristic polynomial (as a formal polynomial, i.e. determined by its coefficients) still unique in the case of finite fields? And if not, do you know an example?

There are 3 answers below.

BEST ANSWER

The characteristic polynomial $p_A(x)$ of a matrix $A \in \operatorname{M}_n(F)$ is defined as $$ p_A(x) := \det(x I - A) \in F[x], $$ i.e. $p_A(x)$ is the determinant of the matrix $xI - A \in \operatorname{M}_n(F[x])$. (Instead of the commutative ring $F[x]$ one could also use its field of fractions $F(x)$.)

The argument you give still applies: If $B \in \operatorname{M}_n(F)$ is similar to $A$ over $F$, then they are also similar over $F[x]$, and therefore $xI - A$ and $xI - B$ are similar over $F[x]$. Since similar matrices have the same determinant, it follows that $$ p_A(x) = \det (xI - A) = \det (xI - B) = p_B(x). $$

Note that we are always working over the commutative ring $F[x]$ (or field $F(x)$), so we are only working with polynomials themselves, and not with their associated polynomial functions. So the above equality $p_A(x) = p_B(x)$ really is an equality of polynomials, and not just of their associated polynomial functions.
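This can be made concrete even over the finite field $\mathbb F_2$, since the coefficients of $\det(xI - A)$ are computed purely from the entries of $A$. The following sketch (the matrices $A$ and $S$ are my own illustrative choices, and the $2 \times 2$ formula $\det(xI - A) = x^2 - \operatorname{tr}(A)\,x + \det A$ stands in for the general determinant) verifies that two similar matrices over $\mathbb F_2$ have literally the same coefficient sequence:

```python
# Characteristic polynomials over F_2 = Z/2Z, represented as coefficient
# lists [a0, a1, a2] in F_2[x]. For 2x2 matrices,
# det(xI - A) = x^2 - tr(A) x + det(A).
P = 2  # modulus for Z/2Z

def charpoly_2x2(A):
    """Coefficient list [a0, a1, a2] of det(xI - A) in (Z/2Z)[x]."""
    (a, b), (c, d) = A
    return [(a * d - b * c) % P, (-(a + d)) % P, 1]

def matmul(X, Y):
    """2x2 matrix product over Z/2Z."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) % P for j in range(2)]
            for i in range(2)]

A = [[0, 1], [1, 1]]          # an illustrative matrix over F_2
S = [[1, 1], [0, 1]]          # invertible over F_2; S^{-1} = S since -1 = 1
B = matmul(matmul(S, A), S)   # B = S^{-1} A S  (S is its own inverse mod 2)

# Equal as formal polynomials in F_2[x], coefficient by coefficient:
assert charpoly_2x2(A) == charpoly_2x2(B)  # both are x^2 + x + 1
```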


PS:

As was pointed out in the comments, this leads to the question of how to deal with the expression $\det(x \operatorname{id}_V - T)$. There seem to be at least two ways:

Don’t use it

Don’t define the characteristic polynomial of $T$ as $\det(x \operatorname{id}_V - T)$. Instead proceed as follows:

  • Start by defining the characteristic polynomial of $A \in \operatorname{M}_n(F)$ as $p_A(x) := \det(xI - A)$. This makes sense because we can regard $F$ as a subring of $F[x]$ (or a subfield of $F(x)$).
  • Then show that similar matrices have the same characteristic polynomial (as done above).
  • To define the characteristic polynomial of $T \colon V \to V$ take any basis $\mathcal{B}$ of $V$, let $A \in \operatorname{M}_n(F)$ be the matrix of $T$ with respect to $\mathcal{B}$, and set $p_T(x) := p_A(x)$. Since similar matrices have the same characteristic polynomial, this is well-defined.

This still gives you everything you need without making sense of $\det(x\operatorname{id}_V - T)$. Note that for every scalar $\lambda \in F$ we still have $$ p_T(\lambda) = p_A(\lambda) = \det(\lambda I - A) = \det(\lambda \operatorname{id}_V - T), $$ so we can still use the expression $\det(\lambda \operatorname{id}_V - T)$ when plugging in a scalar $\lambda$ for $x$.

Note that this approach of defining the characteristic polynomial $p_T(x)$ only needs that similar matrices have the same characteristic polynomial, which is precisely what you were concerned about in your question.

Extension of scalars

You can also use extension of scalars, if you are familiar with it:

We can extend the $F$-vector space $V$ to an $F(x)$-vector space $V_{F(x)}$ such that

  • $V \subseteq V_{F(x)}$ is an $F$-linear subspace,
  • the linear map $T \colon V \to V$ extends uniquely to an $F(x)$-linear map $T_{F(x)} \colon V_{F(x)} \to V_{F(x)}$ (i.e. we have $T_{F(x)}(v) = T(v)$ for every $v \in V$),
  • any $F$-basis $\mathcal{B} = (v_1, \dotsc, v_n)$ of $V$ is also an $F(x)$-basis of $V_{F(x)}$,
  • $[T]_{\mathcal{B}} = [T_{F(x)}]_{\mathcal{B}}$, i.e. the matrix of $T$ with respect to the $F$-basis $\mathcal{B}$ of $V$ coincides with the matrix of $T_{F(x)}$ with respect to the $F(x)$-basis $\mathcal{B}$ of $V_{F(x)}$. (This is a direct consequence of the previous two points).

If $\mathcal{B} = (v_1, \dotsc, v_n)$ is any $F$-basis of $V$, with respect to which $T$ is given by the matrix $A \in \operatorname{M}_n(F)$, then $A$ will also be the matrix of $T_{F(x)}$ with respect to $\mathcal{B}$, when regarded as an $F(x)$-basis of $V_{F(x)}$. Then $x I - A$ will be the matrix of $x \operatorname{id}_{V_{F(x)}} - T_{F(x)}$ with respect to $\mathcal{B}$. (The expression $x \operatorname{id}_{V_{F(x)}} - T_{F(x)}$ makes sense because $V_{F(x)}$ is an $F(x)$-vector space.) With this one can define the characteristic polynomial $p_T(x)$ as $p_T(x) := \det(x \operatorname{id}_{V_{F(x)}} - T_{F(x)})$.

As an example, consider the case of $V = F^n$.

In this case, the extension of scalars $V_{F(x)} = (F^n)_{F(x)}$ is given by $(F^n)_{F(x)} = F(x)^n$. Then $F^n \subseteq F(x)^n$ is an $F$-linear subspace, and the standard basis $\mathcal{B} = (e_1, \dotsc, e_n)$ of $F^n$ is clearly an $F(x)$-basis of $F(x)^n$.

The $F$-linear map $T \colon V \to V$ is necessarily given by multiplication with the matrix $A \in \operatorname{M}_n(F)$ whose $j$-th column is $T(e_j)$, and the induced $F(x)$-linear map $T_{F(x)} \colon F(x)^n \to F(x)^n$ will be given by multiplication with the same matrix $A$; this makes sense since $\operatorname{M}_n(F) \subseteq \operatorname{M}_n(F(x))$. Hence $x \operatorname{id}_{F(x)^n} - T_{F(x)}$ will be given by the matrix $x I - A$ with respect to $\mathcal{B}$, just as promised.
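Computationally, working over $F(x)$ just means treating $x$ as an indeterminate, so $\det(x \operatorname{id} - T_{F(x)})$ can be computed as a literal symbolic determinant. A small sympy sketch of the $V = F^n$ case (the matrix $A$ is my own illustrative choice, with $F = \mathbb Q$):

```python
import sympy

x = sympy.symbols('x')
A = sympy.Matrix([[0, 1], [1, 1]])  # matrix of T : F^2 -> F^2, example choice

# xI - A is a matrix over F[x] (a subring of F(x)); its determinant is the
# characteristic polynomial, an honest element of F[x].
char_poly = (x * sympy.eye(2) - A).det()

assert sympy.expand(char_poly) == x**2 - x - 1
```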

ANSWER

The identity you proved $$\det(\lambda \cdot I - B) = \det(S^{-1}(\lambda I - A)S) = \det(\lambda \cdot I - A)$$ is valid as an identity of formal polynomials, not just an equality for all values of $\lambda$.

This is because both sides are polynomials in all of the variables involved, with integer coefficients. You have already proved the identity in fields of characteristic zero. That means that both sides must be identical as formal polynomials. But that implies that over any commutative ring, both sides expand to the same formal polynomial, because the operations involved in expanding out the polynomial (taking the determinant followed by expanding using the commutative, associative, distributive properties) are valid in every commutative ring. The general principle is sometimes called the "principle of permanence of identities" (one reference for this is Artin's book Algebra, Section 12.3, p. 456-457):

To prove that an identity of formal polynomials with integer coefficients (in any number of variables) holds in every commutative ring, it is sufficient to prove it in a single field of characteristic zero.

("Characteristic zero" is important, because I can prove 2=0 in a field of characteristic 2, but that's not valid in every commutative ring.)

Here's an example of this type of argument for a simpler problem:

Prove that for 2 by 2 matrices $A$ and $B$ over a commutative ring $R$, $\det(A) \det(B) = \det(AB)$.

Written out in general form, the identity we want to prove is (the vertical bars denote determinant): $$\begin{vmatrix} a & b \\ c & d \end{vmatrix} \cdot \begin{vmatrix} e & f \\ g & h \end{vmatrix} = \begin{vmatrix} ae+bg & af+bh \\ ce+dg & cf+dh \end{vmatrix}$$

And taking the determinant gives the following form of the identity that we want to prove:

$$ (ad-bc)(eh-fg) = (ae+bg)(cf+dh) - (ce+dg)(af+bh) \tag{*}$$

Here we are treating $a,b,c,d,e,f,g,h$ as indeterminates, not as elements of $R$. It is clear that both sides of the proposed identity are polynomials with integer coefficients in $a,b,c,d,e,f,g,h$. So far, nothing has depended on $R$, because the matrix multiplication and determinant operations are the same no matter what (commutative) ring we're working in.

Now imagine we don't know anything else about matrices or determinants, and we're asked to prove identity (*). How would we do it? The obvious thing is to expand both sides of (*) and collect like terms. If we get the same expanded form on both sides, then we would have proven the identity. Furthermore, this proof would be valid over any commutative ring $R$, because the process of expanding and collecting terms is valid over any commutative ring. It only involves the commutative, associative, and distributive properties, and integer arithmetic, all of which are valid over every commutative ring.
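The expand-and-collect step can also be carried out mechanically, for instance with sympy (a sketch; the symbols are exactly the indeterminates of (*)):

```python
import sympy

a, b, c, d, e, f, g, h = sympy.symbols('a b c d e f g h')

# The two sides of identity (*):
lhs = (a*d - b*c) * (e*h - f*g)
rhs = (a*e + b*g)*(c*f + d*h) - (c*e + d*g)*(a*f + b*h)

# Expanding and collecting like terms yields the same formal polynomial
# with integer coefficients, so (*) holds in every commutative ring.
assert sympy.expand(lhs - rhs) == 0
```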

Next, imagine that we don't want to do all this expansion, so we find some other method of proving this identity, but our proof is valid only over one particular field of characteristic zero (say, the complex numbers, where we could use something like topology which wouldn't be valid over a general $R$). Having proven this identity over $\mathbb{C}$, we go back and ask ourselves what would happen if we expand both sides of (*). Must we get the same polynomial on both sides? We must, because if we got two different polynomials on the two sides, then there would have to be some complex numbers we could substitute for $a,b,\ldots,h$ that would produce different values on the two sides. That would contradict the fact that we've proven the identity over $\mathbb{C}$. So, when we expand both sides, in fact the polynomials are the same on both sides, so as we concluded previously, the identity is valid over any commutative ring $R$.

So it suffices to check the identity over a single field of characteristic 0.

ANSWER

But all I can derive from the above arguments is that the values of the determinant are the same, i.e. if $p(x) := \det(x\cdot I - A)$ and $q(x) := \det(x \cdot I - B)$, then $q(x) = p(x)$ for all $x \in F$ if $B = S^{-1}AS$.

There's your mistake: you want to do this calculation for the specific element $x \in F[x]$, or if you insist on working only over fields, for the element $x \in F(x)$.