(-1)^Matrix - general matrix exponential


I am going through a physics lecture where the following is stated:

$$(-1)^\hat n = \left( \begin{matrix}1 & 0\\ 0 & -1\end{matrix} \right) := \sigma^z$$

The $\hat n$ stands for the number operator in this context and is represented by the matrix $$\hat n = \left( \begin{matrix}0 & 0\\ 0 & 1\end{matrix} \right)$$ and $\sigma^z$ is the conventional Pauli matrix.

How does one arrive at the first statement?

  1. My first guess:
    If I follow my intuition and transfer the calculation scheme of the matrix exponential function for diagonal matrices, I can write:

$$(-1)^\hat n = \left( \begin{matrix}(-1)^0 & 0\\ 0 & (-1)^{1}\end{matrix} \right) = \left( \begin{matrix}1 & 0\\ 0 & -1\end{matrix} \right) $$
However, I don't know any mathematical justification for these steps!

  1. My second approach was to express the power as an exponential:

$$ (-1)^\hat{n} = \exp(\hat{n} \ln(-1)) = \exp(i \pi\hat{n})$$ $$ = \exp\left( \left(\begin{matrix}0 & 0\\ 0 & i\pi\end{matrix} \right) \right) = \left( \begin{matrix}e^0 & 0\\ 0 & e^{i\pi}\end{matrix} \right) = \left( \begin{matrix}1 & 0\\ 0 & -1\end{matrix} \right)$$
However, I am not sure whether it is allowed to transform the power into an exponential expression when matrices are involved.

What is the correct way of doing the calculation and which conditions have to be met in order to do so? I am afraid that the way I am doing it only returns the correct result for some corner cases and is not true in general.
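As a quick sanity check, the second approach can be verified numerically. The sketch below (assuming NumPy and SciPy are available) computes $\exp(i\pi\hat n)$ with `scipy.linalg.expm` and compares it to $\sigma^z$:

```python
import numpy as np
from scipy.linalg import expm

# Number operator n and Pauli matrix sigma^z from the question
n = np.array([[0, 0], [0, 1]], dtype=complex)
sigma_z = np.array([[1, 0], [0, -1]], dtype=complex)

# (-1)^n = exp(n * ln(-1)) = exp(i*pi*n), using the principal branch ln(-1) = i*pi
result = expm(1j * np.pi * n)

print(np.allclose(result, sigma_z))  # True
```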

On BEST ANSWER

I'm sure others are better qualified than I am to answer this. But here is how I'd approach it and do the calculations.

If $A$ is an $n\times n$ complex matrix we can define a new matrix $\exp A$ by setting $$ \exp A:=\lim_{N\to\infty}\sum_{k=0}^{N} \frac{A^k}{k!}. $$ The fact that this limit exists can be established by considering the entries of the matrix individually, and comparing with the convergent series for $\exp (n\cdot||A||)$.
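The truncated series converges quickly for matrices of modest norm. A minimal sketch (the test matrix and truncation order `N=30` are arbitrary choices, not from the original) compares the partial sums against SciPy's `expm`:

```python
import numpy as np
from scipy.linalg import expm

def exp_series(A, N=30):
    """Truncated power series sum_{k=0}^{N} A^k / k!, accumulated term by term."""
    result = np.eye(A.shape[0], dtype=complex)
    term = np.eye(A.shape[0], dtype=complex)
    for k in range(1, N + 1):
        term = term @ A / k          # term is now A^k / k!
        result = result + term
    return result

A = np.array([[0, 1], [-1, 0]], dtype=complex)  # arbitrary test matrix
print(np.allclose(exp_series(A), expm(A)))  # True
```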

This function satisfies the characteristic property $\exp(A)\exp(B)=\exp(A+B)$ whenever $A$ and $B$ commute (for non-commuting matrices it fails in general).

We can then define for each $r\in\mathbb{R}_{>0}$ a matrix $$ r^{A}:=\exp (A \log r) $$ and this will satisfy the implied requirement $r^{A+B}=r^A r^B$, again provided $A$ and $B$ commute.

If we want to replace $r$ by a non-zero complex number $z$ then we must take the same sort of care as we do when $A=a\in\mathbb{C}$: we must define $\log z$ in some consistent way (choose a branch).

As to calculation, the easiest case is when $A$ is diagonalisable. In that case we have a matrix $P$ such that $P^{-1}AP=D$ with $D$ diagonal. Then we can justify each step in the following $$ P^{-1}(\exp A) P= \exp (P^{-1} A P)= \exp\left(\begin{bmatrix} \lambda_1 & 0 & \dots &0\\ 0 & \lambda_2 & \dots &0\\ \vdots& &\ddots &\vdots\\ 0 & 0 & \dots &\lambda_n\\ \end{bmatrix}\right)= \begin{bmatrix} \exp \lambda_1 & 0 & \dots &0\\ 0 & \exp \lambda_2 & \dots &0\\ \vdots& &\ddots &\vdots\\ 0 & 0 & \dots &\exp \lambda_n\\ \end{bmatrix} $$ so that $$ \exp(A)=P \begin{bmatrix} \exp \lambda_1 & 0 & \dots &0\\ 0 & \exp \lambda_2 & \dots &0\\ \vdots& &\ddots &\vdots\\ 0 & 0 & \dots &\exp \lambda_n\\ \end{bmatrix} P^{-1}. $$

From this it is immediate that $$ r^A = P \begin{bmatrix} r^{\lambda_1} & 0 & \dots &0\\ 0 & r^{\lambda_2} & \dots &0\\ \vdots& &\ddots &\vdots\\ 0 & 0 & \dots &r^{\lambda_n}\\ \end{bmatrix} P^{-1} $$
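A small numerical illustration of the diagonalisation recipe (the example matrix and the base $r=3$ are arbitrary choices): apply the scalar function to the eigenvalues, conjugate back by $P$, and compare with the series-based result.

```python
import numpy as np
from scipy.linalg import expm

def func_of_diagonalisable(A, f):
    """Compute f(A) for diagonalisable A via A = P D P^{-1} => f(A) = P f(D) P^{-1}."""
    eigvals, P = np.linalg.eig(A)
    return P @ np.diag(f(eigvals)) @ np.linalg.inv(P)

A = np.array([[2, 1], [1, 2]], dtype=complex)  # diagonalisable (symmetric) example

# exp(A) via eigenvalues agrees with scipy's expm
print(np.allclose(func_of_diagonalisable(A, np.exp), expm(A)))  # True

# r^A via r^{lambda_i} agrees with exp(A log r)
r = 3.0
r_to_A = func_of_diagonalisable(A, lambda lam: r ** lam)
print(np.allclose(r_to_A, expm(np.log(r) * A)))  # True
```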

If $A$ is not diagonalisable things are more complicated. By the Theorem of the Jordan Canonical Form we see that what we need to do is exponentiate Jordan blocks. This is not too difficult, but let me do the simplest case. Suppose $$ A= \begin{bmatrix} \lambda & 1\\ 0 & \lambda\\ \end{bmatrix} = \lambda I +J, \text{ say}. $$ Then as $J^2=O$ we have that $A^k= \lambda^k I + k \lambda^{k-1}J$. Dividing by $k!$ and adding up we have $$ \exp\left(\begin{bmatrix} \lambda & 1\\ 0 & \lambda\\ \end{bmatrix} \right)= \begin{bmatrix} \exp(\lambda) & \exp(\lambda) \\ 0 & \exp(\lambda) \\ \end{bmatrix}. $$ (This is typical; for larger Jordan blocks we get $\frac{1}{s!}\exp(\lambda)$ on the superdiagonals. )
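The Jordan-block formula above can be checked numerically as well. A short sketch (the value $\lambda = 2$ is an arbitrary choice) confirms that $\exp$ of a $2\times2$ Jordan block has $e^\lambda$ on both the diagonal and the superdiagonal:

```python
import numpy as np
from scipy.linalg import expm

lam = 2.0
J_block = np.array([[lam, 1.0], [0.0, lam]])  # 2x2 Jordan block

# The series gives exp(lambda) on the diagonal and on the superdiagonal
expected = np.exp(lam) * np.array([[1.0, 1.0], [0.0, 1.0]])
print(np.allclose(expm(J_block), expected))  # True
```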

When we turn to $r^A$, however, things are less straightforward. As $$ \frac{1}{k!}(\log r\,\lambda I+ \log r\, J)^k= \frac{1}{k!} (\log r \,\lambda)^k I + \frac{1}{(k-1)!}(\log r \,\lambda)^{k-1} \log r\, J $$ we will finally get $$ r^A= \begin{bmatrix} r^\lambda & r^\lambda \log r\\ 0 & r^\lambda\\ \end{bmatrix}. $$ This extra $\log r$ may be what you had in mind when you said the general case is different. It isn't really surprising: we would expect derivative-like terms on the superdiagonals.
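The extra $\log r$ factor is easy to confirm numerically via the definition $r^A := \exp(A \log r)$; the concrete values $r=2$, $\lambda=1/2$ below are arbitrary choices for illustration:

```python
import numpy as np
from scipy.linalg import expm

r, lam = 2.0, 0.5
A = np.array([[lam, 1.0], [0.0, lam]])  # non-diagonalisable Jordan block

# r^A := exp(A log r); the superdiagonal picks up the extra log r factor
r_to_A = expm(np.log(r) * A)
expected = np.array([[r**lam, r**lam * np.log(r)],
                     [0.0,    r**lam]])
print(np.allclose(r_to_A, expected))  # True
```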