Trace of symmetric power of a matrix explanation


Let $A$ be an $n\times n$ complex matrix, seen as a linear map $V\to V$. Then we can define a linear map $S^k(A): S^k(V)\to S^k(V)$. For this map, it makes sense to want to compute $\operatorname{tr}(S^k(A))$.

In a representation theory exercise I was asked to find a formula for $\operatorname{tr}(S^2(A))$ in terms of $\operatorname{tr}(A)$ and $\operatorname{tr}(A^2)$. The only way I was able to do this was by choosing a basis in which $A$ is in its Jordan normal form, explicitly working out the eigenvalues of $S^2(A)$, using the fact that the trace is the sum of the eigenvalues, and then a little bit of factorization to notice that $\operatorname{tr}(S^2(A))=\frac{1}{2}(\operatorname{tr}(A)^2+\operatorname{tr}(A^2))$.
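The $k=2$ identity is easy to sanity-check numerically. Here is a minimal sketch (my addition, not part of the exercise; it assumes NumPy, and the matrix size and random seed are arbitrary) that computes $\operatorname{tr}(S^2(A))$ as the sum $\sum_{i \le j} \lambda_i \lambda_j$ over eigenvalues and compares it with the claimed formula.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))

# Eigenvalues of S^2(A) are the products lambda_i * lambda_j with i <= j,
# so tr(S^2(A)) is the sum of these products.
lam = np.linalg.eigvals(A)
tr_S2 = sum(lam[i] * lam[j] for i in range(n) for j in range(i, n))

# The claimed closed form: (tr(A)^2 + tr(A^2)) / 2.
rhs = (np.trace(A) ** 2 + np.trace(A @ A)) / 2
assert np.isclose(tr_S2, rhs)
```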

However, I do not like this proof, because it doesn't make the situation clearer for me; it just "works out" without telling me anything. In particular, generalizing this proof seems impossible, because the computations quickly become less and less enjoyable.

My question is: does anybody know of a "nice" proof which is more systematic? A concrete version of "nice" could be a proof which doesn't fix a basis.

I know that beauty is in the eye of the beholder, but hopefully you can help me get a more enlightening proof, thank you in advance.


Best answer:

Edit: Actually there is a completely different argument that avoids both eigenvalues and the detour into symmetric functions! I've added it at the end.


Here is an argument for general $k$ which produces an explicit closed form. It factors into two fairly independent parts: understanding what $\text{tr}(S^k(A))$ looks like in terms of the eigenvalues of $A$, then an exercise in symmetric functions.

First, eigenvalues. As a general fact it is possible to show that any polynomial function of the entries of a matrix $A$ which is invariant under conjugation must be a symmetric polynomial in the eigenvalues $\lambda_i$ of $A$. The easiest example is that the traces $\text{tr}(A^k) = \sum \lambda_i^k$ correspond to the power sum symmetric polynomials $p_k$; the next easiest is that the coefficients of the characteristic polynomial correspond to the elementary symmetric polynomials $e_k$.

$\text{tr}(S^k(A))$ is also a polynomial invariant, so it must correspond to some family of symmetric functions in the eigenvalues of $A$, and by computing the trace in Jordan normal form one sees that it is exactly the complete homogeneous symmetric polynomial $h_k$. I actually don't know an argument that avoids reasoning in terms of eigenvalues like this.

The exercise in symmetric function theory is how to express the $h_k$ in terms of the $p_k$. Classic results in symmetric function theory imply that this is always possible (in fact any of the families $p_k, e_k, h_k$ can be expressed in terms of each other) but it's possible to be more explicit as follows. The $h_k$ actually organize into a generating function

$$\sum_{k \ge 0} \text{tr}(S^k(A)) t^k = \sum_{k \ge 0} h_k(\lambda_1, \dots \lambda_n) t^k = \prod_{i=1}^n \frac{1}{1 - \lambda_i t} = \frac{1}{\det(I - At)}$$

and this generating function can be rewritten by first taking its logarithm and then re-exponentiating it again, which gives

$$\begin{eqnarray*} \sum_{k \ge 0} h_k t^k &=& \exp \left( \sum_{i=1}^n \log \frac{1}{1 - \lambda_i t} \right) \\ &=& \exp \left( \sum_{i=1}^n \sum_{k \ge 1} \frac{\lambda_i^k t^k}{k} \right) \\ &=& \exp \left( \sum_{k \ge 1} \frac{p_k t^k}{k} \right). \end{eqnarray*} $$

Rewritten in terms of traces, this says that

$$\boxed{ \sum_{k \ge 0} \text{tr}(S^k(A)) t^k = \frac{1}{\det(I - At)} = \exp \left( \sum_{k \ge 1} \frac{\text{tr}(A^k) t^k}{k} \right) }.$$
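The boxed generating-function identity can be checked numerically at a point inside the disc of convergence $|t| < 1/\rho(A)$, where $\rho(A)$ is the spectral radius. A sketch (my addition, assuming NumPy; the size, seed, and truncation order are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 4
A = rng.standard_normal((n, n))

# The series sum_k tr(A^k) t^k / k converges for |t| < 1/rho(A);
# pick t well inside the disc so the truncation error is negligible.
rho = max(abs(np.linalg.eigvals(A)))
t = 0.1 / rho

lhs = 1.0 / np.linalg.det(np.eye(n) - A * t)
rhs = np.exp(sum(np.trace(np.linalg.matrix_power(A, k)) * t ** k / k
                 for k in range(1, 60)))
assert np.isclose(lhs, rhs)
```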

This already shows that the $h_k$ can be written as polynomials in the $p_k$ with rational coefficients. But we can be more explicit! The permutation form of the exponential formula can be used to expand out the exponential, and the result is the beautiful identity

$$h_k = \frac{1}{k!} \sum_{\pi \in S_k} p_1^{c_1(\pi)} p_2^{c_2(\pi)} \dots p_k^{c_k(\pi)}$$

where $c_i(\pi)$ is the number of $i$-cycles in the cycle decomposition of the permutation $\pi$. Again rewritten in terms of traces, this gives

$$\boxed{ \text{tr}(S^k(A)) = \frac{1}{k!} \sum_{\pi \in S_k} \text{tr}(A)^{c_1(\pi)} \text{tr}(A^2)^{c_2(\pi)} \dots \text{tr}(A^k)^{c_k(\pi)} }.$$

This immediately reproduces your calculation for $k = 2$; for general $k$, collecting permutations by cycle type gives a sum of $p(k)$ terms (where $p(k)$ is the number of partitions of $k$, which is the number of cycle types of permutations in $S_k$). For example, the $k = 3$ identity is

$$\text{tr}(S^3(A)) = \frac{\text{tr}(A)^3 + 3 \text{tr}(A) \text{tr}(A^2) + 2 \text{tr}(A^3)}{6}$$
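The boxed permutation-sum formula can be verified directly by brute force over $S_k$. The following sketch (my addition, assuming NumPy; the helper `cycle_type` and the parameters are my own choices) compares the sum over permutations against $h_k$ computed from the eigenvalues, for the $k=3$ case above.

```python
import numpy as np
from itertools import permutations, combinations_with_replacement
from math import factorial, prod

def cycle_type(pi):
    """Return counts[i] = number of i-cycles of the permutation pi of {0,...,k-1}."""
    k = len(pi)
    seen, counts = [False] * k, [0] * (k + 1)
    for i in range(k):
        if not seen[i]:
            length, j = 0, i
            while not seen[j]:
                seen[j], j, length = True, pi[j], length + 1
            counts[length] += 1
    return counts

rng = np.random.default_rng(3)
n, k = 4, 3
A = rng.standard_normal((n, n))
tr = [None] + [np.trace(np.linalg.matrix_power(A, i)) for i in range(1, k + 1)]

# Right side: (1/k!) * sum over pi in S_k of prod_i tr(A^i)^(c_i(pi)).
rhs = sum(prod(tr[i] ** c for i, c in enumerate(cycle_type(pi)) if i >= 1)
          for pi in permutations(range(k))) / factorial(k)

# Left side: tr(S^k(A)) = h_k = sum of all degree-k monomials in the eigenvalues.
lam = np.linalg.eigvals(A)
lhs = sum(prod(c) for c in combinations_with_replacement(lam, k))

assert np.isclose(lhs, rhs)
```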

where the first term corresponds to the identity permutation, the second term corresponds to the three transpositions $(12), (23), (31)$, and the third term corresponds to the two $3$-cycles $(123), (321)$.

Almost exactly the same argument can be used to deduce a corresponding formula for the elementary symmetric polynomials $e_k = \text{tr}(\Lambda^k(A))$ which I'll leave as a pleasant exercise.


Edit: Here is a sketch of an alternative argument that avoids both eigenvalues and the detour into symmetric functions. Our starting point is the following general fact: if $T : V \to V$ is a linear transformation on a finite-dimensional vector space which preserves a subspace $W$, let $P : V \to V$ be any idempotent projecting onto $W$. Then $T$ has a restriction $T|_W : W \to W$ and so we can ask about the trace of this restriction.

Lemma: With the above hypotheses, $\text{tr}(T|_W) = \text{tr}(TP) = \text{tr}(P T) = \text{tr}(P T P)$.

Sketch. Pick a basis $w_1, \dots, w_k$ of $W$, then a basis $w_{k+1}, \dots, w_n$ of $\ker(P)$. Since $T$ preserves $W$ by hypothesis, it has an upper triangular block structure when written in this basis, and the upper left block is $T|_W$ expressed in the basis $\{ w_1, \dots, w_k \}$. In this basis $P$ itself has block structure $\left[ \begin{array}{cc} I & 0 \\ 0 & 0 \end{array} \right]$, and multiplying by $P$ on the left, the right, or both has the effect of zeroing out the lower right block of $T$, so only the upper left block, corresponding to $T|_W$, contributes to the trace. $\Box$
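The lemma is also easy to test numerically by building $T$ and $P$ from an explicit basis. A sketch (my addition, assuming NumPy; the block construction mirrors the proof, with arbitrary sizes and seed):

```python
import numpy as np

rng = np.random.default_rng(4)
n, k = 5, 2

# Build T preserving a k-dimensional subspace W: choose a random basis B whose
# first k columns span W, and make T block upper triangular in that basis.
B = rng.standard_normal((n, n))
X = rng.standard_normal((k, k))          # T restricted to W, in this basis
Y = rng.standard_normal((k, n - k))
Z = rng.standard_normal((n - k, n - k))
M = np.block([[X, Y], [np.zeros((n - k, k)), Z]])
T = B @ M @ np.linalg.inv(B)

# P: idempotent projecting onto W along the span of the remaining columns of B.
P = B @ np.diag([1.0] * k + [0.0] * (n - k)) @ np.linalg.inv(B)

# tr(T|_W) = tr(X) should equal tr(TP) = tr(PT) = tr(PTP).
for product in (T @ P, P @ T, P @ T @ P):
    assert np.isclose(np.trace(product), np.trace(X))
```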

We're going to use this fact as follows. If $A : V \to V$ is a linear transformation on a finite-dimensional vector space, we can construct the symmetric power $S^k(A)$ (over a field of characteristic larger than $k$) by first considering the tensor power $A^{\otimes k} : V^{\otimes k} \to V^{\otimes k}$, then restricting $A^{\otimes k}$ to the subspace $\text{Sym}^k(V)$ of symmetric tensors. This is a subspace rather than a quotient space of $V^{\otimes k}$, but in characteristic larger than $k$ (which is a hypothesis we need anyway for the formula given above to make sense) it is naturally isomorphic to the symmetric power $S^k(V)$.

This is significant because we can write down a projection $V^{\otimes k} \to \text{Sym}^k(V)$ to symmetric tensors quite explicitly, namely it is given by averaging over the symmetric group

$$P = \frac{1}{k!} \sum_{\pi \in S_k} \pi.$$

It follows by our lemma that

$$\text{tr}(S^k(A)) = \text{tr}(A^{\otimes k}|_{\text{Sym}^k(V)}) = \frac{1}{k!} \sum_{\pi \in S_k} \text{tr}(A^{\otimes k} \pi).$$

This ought to look quite familiar! Now we need to calculate $\text{tr}(A^{\otimes k} \pi)$ where $\pi \in S_k$ is a permutation. This map decomposes as a tensor product of $c_i(\pi)$ copies of $A^{\otimes i}$ times an $i$-cycle, where $c_i(\pi)$ is the number of $i$-cycles in the cycle decomposition of $\pi$, and since trace is multiplicative with respect to tensor product it suffices to understand the trace of $A^{\otimes i}$ times an $i$-cycle.

To match up with the explicit formula we got above this trace must be equal to $\text{tr}(A^i)$. This is true but I don't have a proof that's super easy to write down. You can do it very explicitly by just computing, carefully, what the diagonal elements of a matrix representation of $A^{\otimes i}$ times an $i$-cycle are, but I think the cleanest way to organize the computation is combinatorially. We can think of $A$ as a weighted adjacency matrix of a directed graph $G$ and $A^{\otimes i}$ is the adjacency matrix of the $i^{th}$ tensor power $G^{\otimes i}$ of that graph. The trace of an adjacency matrix is a sum over all loops in the corresponding graph, and a loop in the graph given by $A^{\otimes i}$ times an $i$-cycle turns out to correspond exactly to a closed walk of length $i$ on $G$, and it's a classical result that these are counted (in a weighted sense) by $\text{tr}(A^i)$.
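Both claims, the averaging formula for $\text{tr}(S^k(A))$ and the fact that $A^{\otimes i}$ composed with an $i$-cycle has trace $\text{tr}(A^i)$, can be checked by building the permutation operators on $V^{\otimes k}$ explicitly. A sketch (my addition, assuming NumPy; the helper `perm_operator` and the flattening convention are my own choices, arranged to match `np.kron`):

```python
import numpy as np
from itertools import permutations, product, combinations_with_replacement
from functools import reduce
from math import factorial, prod

def perm_operator(pi, n):
    """Matrix of the map sending e_{i_1} x ... x e_{i_k} to the tensor with
    factors permuted by pi, flattened big-endian to match np.kron."""
    k = len(pi)
    P = np.zeros((n ** k, n ** k))
    for idx in product(range(n), repeat=k):
        src = sum(i * n ** e for e, i in enumerate(reversed(idx)))
        tgt = tuple(idx[pi[j]] for j in range(k))
        P[sum(i * n ** e for e, i in enumerate(reversed(tgt))), src] = 1.0
    return P

rng = np.random.default_rng(5)
n, k = 3, 3
A = rng.standard_normal((n, n))
Ak = reduce(np.kron, [A] * k)            # A tensor A tensor A

# (1/k!) sum over pi of tr(A^{x k} pi) should equal tr(S^k(A)) = h_k(eigenvalues).
avg = sum(np.trace(Ak @ perm_operator(pi, n))
          for pi in permutations(range(k))) / factorial(k)
lam = np.linalg.eigvals(A)
h_k = sum(prod(c) for c in combinations_with_replacement(lam, k))
assert np.isclose(avg, h_k)

# tr(A^{x k} pi) for a full k-cycle should equal tr(A^k).
cycle = tuple(range(1, k)) + (0,)
assert np.isclose(np.trace(Ak @ perm_operator(cycle, n)),
                  np.trace(np.linalg.matrix_power(A, k)))
```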

Edit #2: Also I cannot resist mentioning that if we multiply each permutation by $\text{sgn}(\pi)$ above then we get a corresponding formula for the exterior powers $\text{tr}(\Lambda^k(A))$, because the corresponding projection $P$ projects to antisymmetric tensors; and more generally, if we multiply each permutation by the character of an irreducible representation of $S_n$, we get the character of the corresponding Schur functor, which corresponds to projection to the corresponding isotypic component of $V^{\otimes n}$. This gives us what is called the Frobenius characteristic map from characters of the symmetric group to symmetric functions, in the form of characters of the general linear group, and is closely related to Schur-Weyl duality.
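The signed variant can be tested the same way. A sketch (my addition, assuming NumPy; it uses the fact that $\text{sgn}(\pi) = (-1)^{k - \#\text{cycles}(\pi)}$, and compares the signed average against $e_k$, the sum of eigenvalue products over $k$-element subsets):

```python
import numpy as np
from itertools import permutations, combinations
from math import factorial, prod

def cycle_counts(pi):
    """Return counts[i] = number of i-cycles of the permutation pi of {0,...,k-1}."""
    k = len(pi)
    seen, counts = [False] * k, [0] * (k + 1)
    for i in range(k):
        if not seen[i]:
            length, j = 0, i
            while not seen[j]:
                seen[j], j, length = True, pi[j], length + 1
            counts[length] += 1
    return counts

rng = np.random.default_rng(6)
n, k = 4, 3
A = rng.standard_normal((n, n))
tr = [None] + [np.trace(np.linalg.matrix_power(A, i)) for i in range(1, k + 1)]

# Signed average: sgn(pi) = (-1)^(k - total number of cycles of pi).
signed = 0.0
for pi in permutations(range(k)):
    c = cycle_counts(pi)
    sgn = (-1) ** (k - sum(c))
    signed += sgn * prod(tr[i] ** c[i] for i in range(1, k + 1))
signed /= factorial(k)

# tr(Lambda^k(A)) = e_k = sum over k-element subsets of eigenvalue products.
lam = np.linalg.eigvals(A)
e_k = sum(prod(c) for c in combinations(lam, k))
assert np.isclose(signed, e_k)
```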

Another answer:

If $A$ is diagonal with eigenvalues $\lambda_i$, then $\operatorname{tr}(S^k A)$ is the sum of all monomials of degree $k$ in $\lambda_i$, often denoted $h_k$. Similarly $\operatorname{tr}(A^k) = \sum \lambda_i^k =: p_k$, the power sum polynomial in $\lambda_i$. With rational coefficients the latter generate the ring of symmetric polynomials, in particular $h_k = f_k(p_1, \dots, p_k)$ for a (unique) rational polynomial $f_k$. It is not difficult to write down these polynomials explicitly for specific $k$ and recursively in general. The example in the question says $f_2(p_1, p_2) = \frac12(p_1^2 + p_2)$.
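The recursive computation mentioned here can be sketched concretely: the generating-function relation gives the Newton-type recursion $m\,h_m = \sum_{j=1}^m p_j h_{m-j}$, which produces each $h_k$ from the power sums. The following is my addition (assuming NumPy; size and seed are arbitrary), comparing the recursion against $h_k$ computed directly from the eigenvalues.

```python
import numpy as np
from itertools import combinations_with_replacement
from math import prod

rng = np.random.default_rng(7)
n, K = 3, 5
A = rng.standard_normal((n, n))

# Newton-type recursion: m * h_m = sum_{j=1}^m p_j * h_{m-j},
# with p_j = tr(A^j) and h_0 = 1.
p = [None] + [np.trace(np.linalg.matrix_power(A, j)) for j in range(1, K + 1)]
h = [1.0]
for m in range(1, K + 1):
    h.append(sum(p[j] * h[m - j] for j in range(1, m + 1)) / m)

# Compare against h_k computed directly from the eigenvalues.
lam = np.linalg.eigvals(A)
for k in range(K + 1):
    direct = sum(prod(c) for c in combinations_with_replacement(lam, k))
    assert np.isclose(h[k], direct)
```

In particular `h[2]` reproduces the example $f_2(p_1, p_2) = \frac12(p_1^2 + p_2)$ exactly.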

Now the two expressions $h_k$ and $f_k(p_1, \dots, p_k)$ are polynomials in the entries of $A$ (once you check that the entries of $S^kA$ are polynomials in the entries of $A$). Each is conjugation invariant, so since they agree on diagonal matrices, they in fact agree on diagonalizable matrices. But this is a dense subset of complex matrices, so they must agree for all $A$.