Clarification for an argument on the symplectic groups in Stillwell's Naive Lie Theory


In Section 3.4 (the symplectic groups) it says:

The quaternion matrices A in Sp(n), like the complex matrices in SU(n), are characterized by the condition $A\bar A^T=1$, where the bar now denotes the quaternion conjugate. The proof is the same as for SU(n).

Section 3.3 (the unitary groups) only says that the arguments of the real case in Section 3.1 carry over to achieve this result in the complex case, whereas the proof in Section 3.1 (orthogonal transformations) relies on the following step:

$A A^T=1$ means $A^T = A^{-1}$, so $1=A^T A=A^T(A^T)^T$, and hence $A^T$ has the same property as $A$.

(That is, the property that $AA^T =1$).

This is the step I'm unclear about: why would it go through for the quaternions? In particular, it's not clear to me why $A^{-1}$ would even make sense for quaternionic matrices.

Best Answer

If your definition of $A^{-1}$ is "the matrix you get after plugging the entries of $A$ into a bunch of formulas" (i.e. the adjugate matrix divided by the determinant), then you'd be right that talking about $A^{-1}$ for quaternionic matrices $A$ without explanation would be problematic. However, if your definition of $A^{-1}$ is "the matrix that you multiply $A$ by to get the identity matrix," then it is not so problematic.

There are things to think about though. For one, say we have a matrix $B$ such that $BA=I$. Does that automatically mean $AB=I$? It turns out, yes it does, but this is somewhat particular to our situation with finite matrices. (This implication won't be true for linear transformations of infinite-dimensional vector spaces, for example.)

Consider the vector space $M_n(\mathbb{H})$ of $n\times n$ quaternionic matrices. (As a real vector space, this has dimension $4n^2$.) Define the transformation $L_A(X):=AX$; this is a real-linear transformation of $M_n(\mathbb{H})$. If $X$ is in the kernel, that is, if $L_A(X)=AX=0$, then we may left-multiply $AX=0$ by $B$ to obtain $BAX=0$, or in other words $X=0$, so we conclude $L_A$ has trivial kernel. By the rank-nullity theorem from linear algebra, $L_A$ must be onto, so there exists a matrix, call it $B'$, such that $L_A(B')=AB'=I$. Now left-multiply $AB'=I$ by $B$ to obtain $BAB'=B$, or in other words $B'=B$. This proves $BA=I$ implies $AB=I$ for quaternionic matrices, or really for matrices over any associative unital ring. (Rings are generally assumed associative, but it turns out there are good reasons to study matrices over the octonions, and the octonions are nonassociative.)
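The argument above can be checked numerically. A standard way to compute with quaternionic matrices is to embed each quaternion $a+b\mathbf{i}+c\mathbf{j}+d\mathbf{k}$ as a $2\times2$ complex matrix, turning an $n\times n$ quaternionic matrix into a $2n\times2n$ complex one; the image of this embedding is closed under inversion, so the numerical inverse again represents a quaternionic matrix. A minimal sketch with NumPy, using a made-up invertible matrix $A$ (the specific entries are just for illustration):

```python
import numpy as np

def q(a, b, c, d):
    """Embed the quaternion a + b*i + c*j + d*k as a 2x2 complex matrix."""
    return np.array([[a + b*1j,  c + d*1j],
                     [-c + d*1j, a - b*1j]])

# A hypothetical 2x2 quaternionic matrix [[1+2i, j], [i+j, 3+2k]],
# embedded as a 4x4 complex matrix.
A = np.block([[q(1, 2, 0, 0), q(0, 0, 1, 0)],
              [q(0, 1, 1, 0), q(3, 0, 0, 2)]])

# Find an inverse numerically; it represents a quaternionic matrix B.
B = np.linalg.inv(A)

# A left inverse is automatically a right inverse, and vice versa.
assert np.allclose(B @ A, np.eye(4))
assert np.allclose(A @ B, np.eye(4))
```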

Second, can there be two different matrices $B$ with the property that $AB=I$ and $BA=I$? No: by the previous reasoning, if $BA=I$, then $L_A$ is one-to-one, so only one matrix $B'$ can satisfy $AB'=I$; and since any left inverse equals this right inverse ($B=B(AB')=(BA)B'=B'$), the left inverse is unique as well. This tells us that when "left inverses" exist, they are also right inverses and are unique, which justifies calling them $A^{-1}$.

Thus, $AA^{\dagger}=I \iff A^{-1}=A^{\dagger} \iff A^{\dagger}A=I$. (I use $A^{\dagger}$ for the conjugate-transpose $\overline{A}{}^T$.) The first equation says the rows of $A$ are orthonormal with respect to $x_1\overline{y_1}+\cdots+x_n\overline{y_n}$, and the second equation says the columns are orthonormal with respect to $\overline{x_1}y_1+\cdots+\overline{x_n}y_n$. Note these are not the same kind of inner product; one is conjugate-linear in the second argument, the other in the first. It's not clear to me whether Stillwell recognizes or acknowledges this difference anywhere. Consider for instance

$$ \begin{pmatrix} x_1 & y_1 \\ x_2 & y_2 \end{pmatrix} = \frac{1}{\sqrt{3}} \begin{pmatrix} \mathbf{i} & -1+\mathbf{j} \\ 1+\mathbf{j} & \mathbf{k} \end{pmatrix}. $$

Then $\overline{x_1}y_1+\overline{x_2}y_2=0$ but $x_1\overline{y_1}+x_2\overline{y_2}\ne0$.
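This computation can be verified with the standard $2\times2$ complex embedding of the quaternions; under it, the quaternion conjugate becomes the conjugate-transpose of the embedded block. A minimal sketch (the overall factor $1/\sqrt{3}$ is dropped, since it doesn't affect whether a sum vanishes):

```python
import numpy as np

def q(a, b, c, d):
    """Embed the quaternion a + b*i + c*j + d*k as a 2x2 complex matrix."""
    return np.array([[a + b*1j,  c + d*1j],
                     [-c + d*1j, a - b*1j]])

# Columns x = (x1, x2), y = (y1, y2) of the example matrix.
x1, y1 = q(0, 1, 0, 0), q(-1, 0, 1, 0)   # i   and  -1 + j
x2, y2 = q(1, 0, 1, 0), q(0, 0, 0, 1)    # 1+j and   k

conj = lambda m: m.conj().T              # quaternion conjugate in this representation

first  = conj(x1) @ y1 + conj(x2) @ y2   # conjugate-linear in the first slot
second = x1 @ conj(y1) + x2 @ conj(y2)   # conjugate-linear in the second slot

print(np.allclose(first, 0))             # True: orthogonal for this form
print(np.allclose(second, 0))            # False: but not for the other
```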

My preferences are for conventions different from Stillwell's. First, define $\mathbb{H}^n$ as a right quaternionic vector space; that is, we multiply a vector $v$ by a scalar $\lambda$ on the right. This way, left-multiplication by a matrix $A$ is $\mathbb{H}$-linear, i.e. $A(v\lambda)=(Av)\lambda$. Second, define the inner product of column vectors as $\langle u,v\rangle=u^{\dagger}v$. With these conventions, $\langle Ax,Ay\rangle=(Ax)^{\dagger}(Ay)=x^{\dagger}(A^{\dagger}A)y$, and thus $A\in\mathrm{Sp}(n)$ is directly equivalent to $A^{\dagger}A=I$. (If you instead use $\langle u,v\rangle=u^T\overline{v}$, as Stillwell does, then $(Ax)^T\overline{Ay}$ doesn't simplify, because $\mathbb{H}$ is noncommutative, which I guess is why Stillwell converts everything into complex matrices.)
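As a sanity check under these conventions, the example matrix above does satisfy $A^{\dagger}A=I$, so it lies in $\mathrm{Sp}(2)$. A sketch using the $2\times2$ complex embedding again, under which the quaternionic $A^{\dagger}$ is just the ordinary complex conjugate-transpose of the embedded matrix:

```python
import numpy as np

def q(a, b, c, d):
    """Embed the quaternion a + b*i + c*j + d*k as a 2x2 complex matrix."""
    return np.array([[a + b*1j,  c + d*1j],
                     [-c + d*1j, a - b*1j]])

# The example: (1/sqrt(3)) * [[i, -1+j], [1+j, k]], as a 4x4 complex matrix.
A = np.block([[q(0, 1, 0, 0), q(-1, 0, 1, 0)],
              [q(1, 0, 1, 0), q(0, 0, 0, 1)]]) / np.sqrt(3)

# Columns orthonormal for <u, v> = u† v, hence A ∈ Sp(2) ...
assert np.allclose(A.conj().T @ A, np.eye(4))
# ... and by the left-inverse-implies-right-inverse argument, A A† = I too.
assert np.allclose(A @ A.conj().T, np.eye(4))
```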