TLDR: Group characters and the characteristic polynomial play very similar roles but are introduced in very different terms. Can the characteristic polynomial be understood through the lens of character theory?
I am trying to get familiar with representation theory by using it to understand the "inner workings" of finite-dimensional linear algebra.
Trying to figure out how one could come up with the idea of characteristic polynomials, I stumbled upon this MSE post. Building the characteristic polynomial from traces sounds appealing to me. In particular it makes it sound related to the characters of group representations. So I went on and tried to build the characteristic polynomial in a similar fashion to the characters of a finite group.
I will be working exclusively over $\mathbb{C}$ and representation will mean finite-dimensional representation. Here is a quick reminder on characters for a finite group $G$.
Consider the group algebra $\mathbb{C}[G]$ with its canonical basis $(e_g)_{g\in G}$. Interpreting its elements as functions on $G$ gives a commutative (pointwise) product, while interpreting them as "distributions" gives the convolution product $\star$.
Its convolution center $Z(\mathbb{C}[G])$ corresponds to the so-called central functions on $G$.
To any representation $\rho : \mathbb{C}[G]\to \operatorname{End}(V)$ is associated a character $$\chi_\rho = \sum_{g\in G} \operatorname{tr}(\rho(g)) e_g$$
Characters of representations are central functions. They satisfy the following properties ($V$ and $W$ are representations of $G$):
- If $V\simeq W$ then $\chi_V = \chi_W$
- $\chi_{V \oplus W} = \chi_V + \chi_W$
- More generally, for a subrepresentation $V\subseteq W$, $\chi_W = \chi_V + \chi_{W/V}$
- $\chi_{V \otimes W} = \chi_V \chi_W$
- $\chi_{V^*} = \check{\chi}_V : g\mapsto \chi_V(g^{-1})$
Representations of a finite group are unitarisable, hence semi-simple. It turns out that characters of irreducible representations form a linear basis of $Z(\mathbb{C}[G])$, which is moreover orthonormal for the following inner product: $$ \langle f_1,f_2\rangle = \frac{1}{|G|} (f_1\star f_2)(e) $$ As irreducible representations generate all representations under direct sums, we obtain a map from representations of $G$ to $Z(\mathbb{C}[G])$ which characterises representations up to isomorphism. A sophisticated way to put it is that we obtain a map $$ \operatorname{Gr}(\operatorname{Rep}_G) \xrightarrow{\chi} Z(\mathbb{C}[G]) $$ which exhibits $Z(\mathbb{C}[G])$ as the complexification of the Grothendieck ring of the representations of $G$. The character gives us a computable characterisation of representations.
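As a quick numerical sanity check (a minimal sketch using Python's `cmath`; the choice of the cyclic group $C_3$ and all function names are my own ad hoc illustration), the irreducible characters of $C_3$ are indeed orthonormal for this convolution pairing:

```python
# Sketch: verify that the convolution inner product
# <f1, f2> = (1/|G|) (f1 * f2)(e) makes the irreducible characters
# of the cyclic group C_3 orthonormal. Names here are ad hoc.
import cmath

n = 3
omega = cmath.exp(2j * cmath.pi / n)

def chi(j):
    # j-th irreducible character of C_3: chi_j(k) = omega^(j k)
    return [omega ** (j * k) for k in range(n)]

def inner(f1, f2):
    # (f1 * f2)(e) = sum_h f1(h) f2(h^{-1}); for C_3, h^{-1} = -h mod 3
    return sum(f1[h] * f2[(-h) % n] for h in range(n)) / n

for j1 in range(n):
    for j2 in range(n):
        expected = 1.0 if j1 == j2 else 0.0
        assert abs(inner(chi(j1), chi(j2)) - expected) < 1e-12
```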
Now onto characteristic polynomials.
We are working with a vector space $E$ equipped with an endomorphism $u$.
The endomorphism $u$ is equivalently an action of the monoid $\mathbb{N}$, or more practically of the monoid algebra $\mathbb{C}[\mathbb{N}]\simeq \mathbb{C}[X]$.
We are thus interested in finite-dimensional representations of $\mathbb{C}[X]$. Note that polynomial multiplication is the convolution product of $\mathbb{C}[\mathbb{N}]$.
It is known that the indecomposable representations of $\mathbb{C}[X]$ are the Jordan blocks $\mathbb{C}[X]/(X-\lambda)^k$. In the Grothendieck group there is an identification $$[ \mathbb{C}[X]/(X-\lambda)^k ] = [ (\mathbb{C}[X]/(X-\lambda))^k ]$$ which is exactly the data forgotten by the characteristic polynomial. The characteristic polynomial thus gives us a map $$ \operatorname{Gr}(\operatorname{Rep}_{\mathbb{N}}) \xrightarrow{} \mathbb{C}[X] $$ which here again seems to be a complexification map. Of course the structure theorems directly justify these properties, but I am looking for a different route.
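The forgotten data can be made concrete (a small sketch with sympy; the eigenvalue and block size are arbitrary choices of mine): a Jordan block and the corresponding semisimple direct sum have identical characteristic polynomials even though the modules are not isomorphic.

```python
# Sketch: the characteristic polynomial cannot tell the Jordan block
# C[X]/(X - lam)^k apart from the semisimple sum (C[X]/(X - lam))^k,
# which is exactly the identification made in the Grothendieck group.
import sympy as sp

X = sp.symbols('X')
lam, k = 2, 3

# one k x k Jordan block with eigenvalue lam
jordan = sp.Matrix(k, k, lambda i, j: lam if i == j else (1 if j == i + 1 else 0))
# direct sum of k copies of the one-dimensional module
semisimple = lam * sp.eye(k)

assert jordan.charpoly(X) == semisimple.charpoly(X)  # both (X - 2)^3
assert jordan != semisimple  # yet the underlying modules differ
```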
If we naively copy the trace definition of characters, we obtain $$ \chi_u = \sum_{i\geqslant 0} \operatorname{tr}(u^i) X^i = \operatorname{tr} \left(\frac{1}{1-Xu} \right) $$ This is obviously not the characteristic polynomial.
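To see concretely why this fails (a sketch with sympy; the $2\times 2$ matrix is an arbitrary example of mine): the naive sum is a rational function with poles at the inverse eigenvalues, whereas the characteristic polynomial side is an honest polynomial.

```python
# Sketch: sum_i tr(u^i) X^i = tr((1 - Xu)^{-1}) is a rational function,
# the sum of one geometric series per eigenvalue, not a polynomial.
import sympy as sp

X = sp.symbols('X')
u = sp.Matrix([[1, 1], [0, 2]])  # triangular, eigenvalues 1 and 2

naive = ((sp.eye(2) - X * u).inv()).trace()
closed = 1 / (1 - X) + 1 / (1 - 2 * X)  # one geometric series per eigenvalue
assert sp.simplify(naive - closed) == 0

# by contrast, det(1 - Xu) is the polynomial (1 - X)(1 - 2X)
P = sp.det(sp.eye(2) - X * u)
assert sp.expand(P - (1 - X) * (1 - 2 * X)) == 0
```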
The characteristic polynomial is the reciprocal of the following polynomial: $$ P_u(X) = \det(1-Xu) = \operatorname{gtr} (\Lambda^\bullet u : \Lambda^\bullet E \to \Lambda^\bullet E) = \sum_{i\geqslant 0} (-1)^{i} X^i \operatorname{tr}(\Lambda^i u) $$ with $\Lambda^\bullet u$ the induced graded endomorphism of $\Lambda^\bullet E$. The graded trace expression comes from the aforementioned MSE post. I want motivation for this, and tried to compare it to group characters. To begin with, the dependency on $(E,u)$ is no longer additive: as $$ \Lambda^\bullet (E_1\oplus E_2) \simeq \Lambda^\bullet E_1\otimes \Lambda^\bullet E_2 $$ and $$ \Lambda^\bullet (u_1\oplus u_2) \simeq \Lambda^\bullet u_1\otimes \Lambda^\bullet u_2, $$ we now have the following multiplicative law: $$ P_{u_1\oplus u_2} = \det(1-X(u_1\oplus u_2)) = \det(1-Xu_1)\det(1-Xu_2) = P_{u_1} P_{u_2} $$ so that instead of linear decompositions in $Z(\mathbb{C}[G])$, decomposing a representation now corresponds to a polynomial factorisation. I am particularly puzzled by the involvement of the convolution product. I also think the reciprocity relation between $P_u$ and the usual $\det(X-u)$ may have to do with convolution.
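Both identities above can be checked directly (a sketch with sympy; the matrices are arbitrary examples, and `tr_wedge` is my own helper computing $\operatorname{tr}(\Lambda^i u)$ as the sum of principal $i\times i$ minors):

```python
# Sketch: check det(1 - Xu) = sum_i (-X)^i tr(Lambda^i u) and the
# multiplicative law P_{u1 (+) u2} = P_{u1} P_{u2}.
import itertools
import sympy as sp

X = sp.symbols('X')

def P(m):
    return sp.det(sp.eye(m.rows) - X * m)

def tr_wedge(m, i):
    # trace of the induced map Lambda^i u = sum of principal i x i minors
    if i == 0:
        return sp.Integer(1)  # Lambda^0 u is the identity of C
    return sum(m[list(S), list(S)].det()
               for S in itertools.combinations(range(m.rows), i))

u = sp.Matrix([[1, 2, 0], [0, 3, 1], [1, 0, 1]])
rhs = sum((-X) ** i * tr_wedge(u, i) for i in range(u.rows + 1))
assert sp.expand(P(u) - rhs) == 0

# multiplicativity on a block-diagonal (direct sum) endomorphism
u1, u2 = sp.Matrix([[0, 1], [-1, 0]]), sp.Matrix([[3]])
assert sp.expand(P(sp.diag(u1, u2)) - P(u1) * P(u2)) == 0
```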
In a sense, these are "exponential" characters. There is a simple formula for the logarithm of $\det(1-Xu)$ which is mentioned here: $$ \det(1-Xu) = \sum_{i\geqslant 0} (-1)^{i} X^i \operatorname{tr}(\Lambda^i u) = \exp \left( -\sum_{j\geqslant 1} \frac{1}{j}X^j \operatorname{tr}(u^j) \right) = \exp \big[-\operatorname{tr}\left( \ln(1-Xu) \right) \big] $$ so that there is an expression for the associated "linear" character. It is indeed close to, but different from, the naive suggestion $\operatorname{tr} \left(\frac{1}{1-Xu} \right)$. Now of course the characteristic polynomial is particularly convenient as it is a polynomial of finite degree, but I wonder if there is a way to connect the dots.
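The exp/log identity can also be verified as a truncated formal power series (a sketch with sympy; the matrix and the truncation order $N$ are arbitrary choices of mine):

```python
# Sketch: verify det(1 - Xu) = exp(-sum_{j>=1} X^j tr(u^j) / j) by
# expanding the exponential side as a power series in X. Since the
# identity is exact, the truncated series already equals the degree-2
# polynomial on the left.
import sympy as sp

X = sp.symbols('X')
u = sp.Matrix([[1, 2], [3, 4]])
N = 8  # truncation order for the formal series

det_side = sp.det(sp.eye(2) - X * u)  # a genuine degree-2 polynomial
log_side = -sum(X ** j * (u ** j).trace() / sp.Integer(j)
                for j in range(1, N))
exp_side = sp.series(sp.exp(log_side), X, 0, N).removeO()

assert sp.expand(det_side - exp_side) == 0
```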
There is a chance the similarity is only formal. After all, the same polynomial construction can be applied to a group action: $$ P_\rho = \det(1-X\rho) = \sum_{i\geqslant 0} (-1)^{i} X^i \chi_{\Lambda^i \rho} \in \mathbb{C}[G][X] $$ so that the convolution algebras $\mathbb{C}[G]$ and $\mathbb{C}[X]$ may have unrelated origins.
My main question is:
- Can one start from the idea of characters of representations and end up building the characteristic polynomial?
One related question about a fundamental property of the characteristic polynomial:
- Is there a character-theoretical proof of the Cayley-Hamilton theorem (without relying on the structure theorem)?
I would say they are very related, but the relationship is better understood through eigenvalues. Start with a general algebra $A$ over a field $k$ of characteristic zero, and a representation $V$ of it. Then the character of $V$ is a linear map $\chi_V:A\rightarrow k$, satisfying certain conditions. The important fact is that in characteristic zero, the data of the map $\chi_V$ is equivalent to the data of the multiset of eigenvalues for the action of each element of $A$. This uses characteristic zero, and is equivalent to the assertion that the power sum symmetric functions generate the whole ring of symmetric functions.
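The passage from power sums (traces of powers) back to elementary symmetric functions (characteristic polynomial coefficients) is Newton's identities, which require dividing by integers and hence characteristic zero. A small sketch with sympy (the matrix is my own arbitrary example):

```python
# Sketch: recover the characteristic polynomial of u from the traces
# tr(u^j) alone, via Newton's identities k e_k = sum (-1)^{j-1} e_{k-j} p_j.
import sympy as sp

u = sp.Matrix([[2, 1, 0], [0, 2, 0], [0, 0, 5]])  # eigenvalues 2, 2, 5
n = u.rows

# power sums p_j = tr(u^j) = sum of j-th powers of the eigenvalues
p = [None] + [(u ** j).trace() for j in range(1, n + 1)]

e = [sp.Integer(1)]  # elementary symmetric functions, e_0 = 1
for k in range(1, n + 1):
    # the division by k is exactly where characteristic zero is used
    e.append(sum((-1) ** (j - 1) * e[k - j] * p[j]
                 for j in range(1, k + 1)) / k)

X = sp.symbols('X')
charpoly = sum((-1) ** k * e[k] * X ** (n - k) for k in range(n + 1))
assert sp.expand(charpoly - u.charpoly(X).as_expr()) == 0
```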
So the general process of taking the character (in characteristic zero) is exactly taking the information of the eigenvalues of each element of $A$ on $V$. In general, this procedure loses information, it’s only in very special cases that it doesn’t, like group algebras in characteristic zero. My claim is that this taking of eigenvalues is actually a more useful way of thinking about characters, since it works well in all characteristics, unlike the trace. For example, considering the eigenvalues yields Brauer characters in the case of modular representation theory of finite groups.
Now for the characteristic polynomial, this is just another way of encoding the multiset of eigenvalues. So in characteristic zero the data of the character is equivalent to the data of the characteristic polynomial of every element acting on the representation.
So what about $\mathbb{C}[X]$? Representations of this encode the notion of “a single linear map acting on a vector space”, and the characteristic polynomial is an efficient encoding of this fundamental invariant, the multi set of eigenvalues. We also see that in this case, the invariant isn’t complete, but it’s a reasonable measurement.
As far as proving the Cayley-Hamilton theorem goes, I think the semisimple case is immediate from the product-of-eigenvalues perspective. The general case seems to require knowing which extension problems actually occur (or a Zariski density argument), and does not seem to follow formally from this "multiset of eigenvalues" perspective.
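For what it's worth, the statement itself is easy to check mechanically even in the non-semisimple case, where the eigenvalue argument alone is insufficient (a sketch with sympy; the Jordan block is my own example):

```python
# Sketch: Cayley-Hamilton on a non-diagonalizable matrix. We evaluate the
# characteristic polynomial at u by Horner's scheme and check it vanishes.
import sympy as sp

X = sp.symbols('X')
u = sp.Matrix([[3, 1], [0, 3]])      # a 2 x 2 Jordan block, not semisimple

coeffs = u.charpoly(X).all_coeffs()  # coefficients of (X - 3)^2
acc = sp.zeros(2)
for c in coeffs:
    acc = acc * u + c * sp.eye(2)    # Horner evaluation of the polynomial at u

assert acc == sp.zeros(2)            # p(u) = 0
```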