Transposition in the regular representation


I am trying to understand why we take the transpose of a matrix representing the action of a group, specifically for the regular representation at the moment, but I don't yet know if it has significance for other representations. I have read the 'explanation' in Riley, Hobson and Bence (RHB), in my lecture notes, and in some lecture notes online, but it simply isn't solid in my mind yet; it hasn't clicked.

I have seen what seem to be different approaches at explanation, sometimes in the same text. I presume they are all actually equivalent. For example:

  • In the distinction between permuting objects and positions/labels, and in parallel transforming vectors and vector coefficients.
  • Riley, Hobson and Bence additionally refer to the transposition being necessary to 'preserve the matrix multiplication rules'. To me this is quite mysterious, because the representation and the way the matrices of the representation are applied have not yet been laid down, so this motivation seems to be plucked out of the blue.
  • I have seen approaches considering the dual vector space online and mentioned in a footnote in RHB. If this explanation would be illuminating, I would be interested to hear.

Usually, I just remember it as an algebraic trick. It's easy enough to see why it's necessary on purely algebraic grounds. But it is also possible to understand why on more intuitive grounds, with an example that goes back to graphing functions in elementary algebra. The whole idea is how to go from a group action on a set to one on a collection of functions on that set.

Let's go over the algebra. The most prototypical situation is just having a group act on a set. Recall this means each group element $g\in G$ gets interpreted as a function on a set $X$. This is a left group action, so we can write $g(hx)=(gh)x$ to say that the group operation in $G$ corresponds to composition of functions on $X$. But there is also a notion of right group action, in which each group element is again interpreted as a function, but one written on the right and satisfying the alternate relation $(xg)h=x(gh)$. If we write $\phi_g\in\mathrm{Perm}(X)$ for the function corresponding to the element $g\in G$, then this means $\phi_h\circ \phi_g=\phi_{gh}$, i.e. the map $G\to\mathrm{Perm}(X)$ given by the assignment $g\mapsto \phi_g$ is an antihomomorphism. Antihomomorphisms reverse the group operation. If you compose two antihomomorphisms together, you get a plain old homomorphism.

Most naturally occurring group actions are left actions, but there are the rare occasions when it is more natural to describe a right action. And sometimes you have both left and right actions of groups that "commute" with each other, which is notationally nice since $(gx)h=g(xh)$ for $g\in G$ acting from the left and $h\in H$ acting from the right. This occurs with e.g. monodromy and deck transformations in covering space theory and $GL(V)\curvearrowright V^{\otimes n}\curvearrowleft S_n$ in representation theory.

There is a trick that allows us to convert between left actions and right actions though: precompose the (anti)homomorphism $G\to\mathrm{Perm}(X)$ with the inversion map $G\to G:g\mapsto g^{-1}$. This converts between the two because inversion is an antiautomorphism, $(gh)^{-1}=h^{-1}g^{-1}$. If we have a right action, say we notate it in the alternate way by $X\times G\to X:(x,g)\mapsto xg$, we can define a left action $G\times X\to X:(g,x)\mapsto xg^{-1}$, where the notation $xg^{-1}$ utilizes the right action.

It's easy to check this works: $g(hx)=(xh^{-1})g^{-1}=x(h^{-1}g^{-1})=x(gh)^{-1}=(gh)x$.
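As a small numerical sanity check of this conversion (my own sketch, not from the text): row vectors carry a natural right action of matrices, since $(vA)B=v(AB)$, and precomposing with inversion turns it into a left action.

```python
import numpy as np

rng = np.random.default_rng(0)
inv = np.linalg.inv

# A, B are generic random matrices (invertible with probability 1).
A, B = rng.random((3, 3)), rng.random((3, 3))
v = rng.random(3)  # a row vector

# Converted left action: A * v := v A^{-1}, built from the right action v -> vA.
def left(g, x):
    return x @ inv(g)

# Left-action law: A * (B * v) = (AB) * v,
# because (v B^{-1}) A^{-1} = v (AB)^{-1}.
assert np.allclose(left(A, left(B, v)), left(A @ B, v))
```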

The most common use for this trick is in going from a (left) action $G\curvearrowright X$ to a (left) action $G\curvearrowright\hom(X,Y)$, where $\hom(X,Y)$ denotes the (or a) set of functions $X\to Y$. Consider some function $f:X\to Y$. Arguably the most obvious thing to do to get $G$ to act is to replace $f(x)$ with $f(gx)$; if $X=\mathbb{R}$ and $f(x)$ were given by some formula, you could replace every instance of $x$ with $gx$ and write out the resulting formula. But this is not a left action! If you apply $h$ first to $f(x)$ you get $f(hx)$, then if you apply $g$ you get $f(hgx)$, which matches what you get when you apply $hg$ to $f(x)$, not $gh$. It is a right action!

One way to think about this is that in the notation $f(gx)$, you shouldn't consider $g$ as acting on $x$ from the left; think of it as $g$ acting on $f$ from the right (since $g$ is written literally on the right of $f$). It's pretty clear that $(f\circ \phi_g)\circ \phi_h=f\circ(\phi_g\circ \phi_h)$.
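Here is a tiny check of that right-action law, with functions on a three-element set stored as tuples (a sketch with made-up values):

```python
from itertools import permutations

X = (0, 1, 2)

def comp(f, g):
    """Composition (f o g)(x) = f(g(x)), with maps stored as tuples."""
    return tuple(f[g[x]] for x in X)

# A function X -> Y, written as a tuple of its values.
f = (5, 7, 9)

# Precomposition f -> f o g is a right action: (f.g).h = f.(gh),
# since composition is associative.
for g in permutations(X):
    for h in permutations(X):
        assert comp(comp(f, g), h) == comp(f, comp(g, h))
```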

Here's a weird consequence: applying permutations to tuples can give a right action $X^n\curvearrowleft S_n$,

$$ (x_1,x_2,\cdots,x_n)\sigma=(x_{\sigma(1)},x_{\sigma(2)},\cdots,x_{\sigma(n)}) $$

because a tuple $(x_1,x_2,\cdots,x_n)$ is formally a function $\{1,2,\cdots,n\}\to X$. Thus, to get a left action $S_n\curvearrowright X^n$ you need to replace $\sigma$ by $\sigma^{-1}$ in the indices:

$$ \sigma(x_1,x_2,\cdots,x_n)=(x_{\sigma^{-1}(1)},x_{\sigma^{-1}(2)},\cdots,x_{\sigma^{-1}(n)}). $$

Ironically, this is exactly how you move the value in the $i$th position to the $\sigma(i)$th position, as one would want a permutation $\sigma$ acting on a tuple to do. Check it yourself with $n=3$.
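The suggested $n=3$ check can be automated. The `act` and `inverse` helpers below are my own illustrative scaffolding:

```python
from itertools import permutations

X = (0, 1, 2)

def comp(s, t):
    """(s o t)(i) = s(t(i)), permutations as tuples."""
    return tuple(s[t[i]] for i in X)

def inverse(s):
    inv = [0] * len(s)
    for i in X:
        inv[s[i]] = i
    return tuple(inv)

def act(sigma, x):
    """Left action: sigma.(x_1,...,x_n) = (x_{sigma^{-1}(1)},...)."""
    si = inverse(sigma)
    return tuple(x[si[i]] for i in X)

x = ('a', 'b', 'c')
for s in permutations(X):
    for t in permutations(X):
        # Left-action law: s.(t.x) = (s o t).x
        assert act(s, act(t, x)) == act(comp(s, t), x)
    # The value at position i lands at position s(i).
    assert all(act(s, x)[s[i]] == x[i] for i in X)
```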

(A similar thing happens with $V^{\otimes n}\curvearrowleft S_n$, where $V^{\otimes n}=V\otimes V\otimes\cdots\otimes V$.)

Another consequence is shifting the graph of functions in the plane. Consider the graph of $y=f(x)$. How do we shift this to the right $c$ units? Answer: draw the graph of $f(x-c)$. This is ironic because $x\mapsto x-c$ shifts the number line to the left. (Note $\mathbb{R}$ is an abelian, additively-written group, so the earlier $xg^{-1}$ becomes $x-c$.) This is because if you want to shift the graph of $f$ to the right $c$ units, you must add $c$ to each $x$ coordinate, which means ...

$$ \{(x+c,f(x)):x\in\mathbb{R}\}=\{(x,f(x-c)):x\in\mathbb{R}\}. $$

You can write out the same thing for $(gx,f(x))$ and $(x,f(g^{-1}x))$. If you can fully understand why we use $f(x-c)$ to move a graph to the right, then you understand why we use $f(g^{-1}x)$ instead of $f(x)$ for a left action.
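The set identity above can be checked numerically on a sample grid; $f(x)=x^2$ and $c=3$ are arbitrary choices for illustration:

```python
# {(x+c, f(x)) : x} should equal {(x, f(x-c)) : x} over matching ranges.
f = lambda x: x * x
c = 3
xs = range(-5, 6)

shifted_right = {(x + c, f(x)) for x in xs}            # shift each point right by c
substituted   = {(x, f(x - c)) for x in range(-5 + c, 6 + c)}  # graph of f(x-c)

assert shifted_right == substituted
```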

To reiterate, I think the $X^3\curvearrowleft S_3$ example and the graph-shifting example are the most important ones to fully understand and really grok.

The replacing $g$ by $g^{-1}$ trick works with antiautomorphisms in other contexts too. For instance, quaternion conjugation can be used to get a $2\times2$ complex matrix representation of quaternions which treats $\mathbb{H}$ as a left complex vector space (although personally I think it's more natural to make $\mathbb{H}$ a right vector space and avoid that). Or you can use the transpose for matrices, since it is an antiautomorphism, $(AB)^T=B^TA^T$. The transpose is just another way of going from a matrix acting on a set (of vectors) to acting on a collection of functions (of vectors).

Given any vector space $V$, the dual vector space $V^{\ast}$ is the vector space of all linear functions $V\to\mathbb{R}$ (assuming we're talking about real vector spaces). Given any element $\phi\in V^{\ast}$, the composition $\phi\circ A$ is another element of $V^{\ast}$ (exercise), and the map $\phi\mapsto \phi\circ A$ makes $V^{\ast}$ into a right $\mathrm{End}(V)$-module, as opposed to $V$, which is a left $\mathrm{End}(V)$-module.

Picking coordinates (with respect to some basis), vectors in $V$ may be represented by column vectors. There is then a dual basis (and thus also coordinates) on $V^{\ast}$, so its elements too can be represented by (say) row vectors, in which case $\phi(v)$, where $\phi\in V^{\ast}$ and $v\in V$, can be interpreted as the matrix multiplication $\phi v$ of a row vector with a column vector. The column vector corresponding to $\phi$ is the transpose $\phi^T$, and if we write $\phi(Av)=(\phi A)v$ we see the column vector corresponding to $\phi A$ is $A^T\phi^T$. Thus, if you want to represent $A$'s action on the dual vector space $V^{\ast}$, you can treat $V^{\ast}$ as $V$ and use the transpose $A^T$.
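In coordinates this is a one-line check with random data (my own sketch):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.random((3, 3))
v = rng.random(3)    # column vector in V
phi = rng.random(3)  # row vector in V*

# Associativity: phi(Av) = (phi A) v.
assert np.allclose(phi @ (A @ v), (phi @ A) @ v)

# The column vector representing phi o A is A^T phi^T:
# (phi A)_j = sum_i phi_i A_ij = (A^T phi)_j.
assert np.allclose(phi @ A, A.T @ phi)
```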

With coordinates, $V\cong\mathbb{R}^n$ and $\mathrm{End}(V)\cong M_n(\mathbb{R})$, by the way.

And then there's the regular representation $\rho:G\to GL(\mathbb{C}G)$, where $\mathbb{C}G$ is the free vector space with basis $G$. Every element of $\mathbb{C}G$ is a formal linear combination $\sum_{g\in G} c_g g$, but we can think of the coefficients as a function $c:G\to\mathbb{C}$. The element $\rho(h)$ acts as left-multiplication by $h$:

$$ \rho(h)\sum_{g\in G}c_g g=\sum_{g\in G} c_g (hg) = \sum_{g\in G} c_{h^{-1}g} g $$

and so the coefficient function $c(g)$ gets replaced by the function $c(h^{-1}g)$.
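For concreteness, here is the regular representation of $\mathbb{Z}_3$ (an additive group chosen just to keep the sketch small; the text's $G$ is general), verifying both the homomorphism property and the coefficient shift $c(g)\mapsto c(h^{-1}g)=c(g-h)$:

```python
import numpy as np

n = 3  # the group Z_3 = {0, 1, 2} under addition mod 3

def rho(h):
    """Regular representation: rho(h) e_g = e_{h+g}."""
    M = np.zeros((n, n))
    for g in range(n):
        M[(h + g) % n, g] = 1
    return M

# Homomorphism: rho(h) rho(g) = rho(h+g).
for h in range(n):
    for g in range(n):
        assert np.array_equal(rho(h) @ rho(g), rho((h + g) % n))

# Coefficient shift: the new coefficient at g is c(h^{-1} g) = c(g - h).
c = np.array([10.0, 20.0, 30.0])
h = 1
assert np.array_equal(rho(h) @ c,
                      np.array([c[(g - h) % n] for g in range(n)]))
```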

The same trick goes for permutation representations in general, i.e. where $G$ acts on a set $X$ and hence on the free vector space $\mathbb{C}X$ too. Any permutation representation is unitary, so if you use $X$ as a basis for $\mathbb{C}X$ in order to write $\rho(g)$ as a matrix (technically you need to order $X$ too for that but whatever), the inverse $\rho(g)^{-1}$ is really just $\rho(g)^T$.
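The orthogonality of permutation matrices, and hence $\rho(g)^{-1}=\rho(g)^T$, is easy to verify exhaustively for $S_3$ (again just a small illustrative check):

```python
import numpy as np
from itertools import permutations

def perm_matrix(s):
    """Matrix of the permutation s: column i has a 1 in row s(i)."""
    m = len(s)
    P = np.zeros((m, m))
    for i in range(m):
        P[s[i], i] = 1
    return P

# Permutation matrices are orthogonal: P^T P = I, so P^{-1} = P^T.
for s in permutations(range(3)):
    P = perm_matrix(s)
    assert np.array_equal(P.T @ P, np.eye(3))
```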

In RHB (beginning pg1078) it seems we are using the component functions $u_i$ of a vector $\rm u$ as elements of the dual vector space $V^{\ast}$, whilst vectors are elements of the original vector space $V$. But instead of starting with a matrix acting on $V$ and taking its transpose to represent its action on $V^{\ast}$ (while still using vectors in $V$), it starts with a matrix describing an action on $V^{\ast}$ ("its effect on each basis function $u_i$ is determined") and represents it with the transpose applied to $V$ (consisting of column vectors).