Why are Hermitian matrices called a generalization of real symmetric matrices?


I frequently see people explaining Hermitian matrices as a generalization of real symmetric matrices (e.g., on Wikipedia and Math StackExchange). I understand that all real symmetric matrices are Hermitian matrices, but it seems like there are really two changes between real symmetric matrices and Hermitian matrices: (1) permit complex numbers, and (2) the transpose must equal the element-wise complex conjugate. Why was this second step included in the definition of Hermitian matrices? My guess would be that it has something to do with physicists wanting to ensure particular sequences of linear transformations produce real-valued outputs; is this correct?


There are 2 best solutions below

On BEST ANSWER

The "dot product" in $\Bbb R^n$ has the property that $$ d(x) = \sqrt{x \cdot x} = \sqrt{x^t x} $$ is a norm (so that $d(x - y)$ is a metric). In particular, $x \cdot x = \sum_i x_i^2$ is always a nonnegative real, so we can take a square root. You can fancy this up and look at $$ \sqrt{x^t M x} $$ for some matrix $M$, or (generalizing a little) look at a product defined by $$ (x, y) \mapsto x^t M y. $$

If you want that to be symmetric (which is a nice thing for generalized inner products), then $M$ has to be symmetric.

If you try to do the same thing for a complex vector $z$, with complex number entries $z_i$, and define $$ z \cdot z = \sum_j z_j^2, $$ then the resulting sum is usually not real. (Example: if each $z_j$ is $\sqrt{-1}$, then...)

But if you say that $$ z \cdot z = \sum_j z_j \overline{z}_j, $$ then you DO get something that's a nonnegative real, and can mimic all the stuff you did for $\Bbb R^n$. But when it comes to the matrix $M$, you need (to get the conjugate symmetry of your generalized inner product, $(x, y) = \overline{(y, x)}$) not that $M^t = M$, but that $\overline{M}^t = M$. So that's where the generalization comes from.
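As a rough numerical sketch of the point above (not part of the original answer), the snippet below compares the two candidate "squares" of a complex vector and checks the conjugate symmetry of the form $x^t M \overline{y}$. The function names, the sample vectors, and the two matrices are hypothetical choices made only for illustration.

```python
def naive_square(z):
    """Sum of z_j^2, with no conjugation (usually not real)."""
    return sum(zj * zj for zj in z)


def conjugated_square(z):
    """Sum of z_j * conj(z_j), i.e. the sum of |z_j|^2."""
    return sum(zj * zj.conjugate() for zj in z)


def form(x, M, y):
    """Generalized product x^t M conj(y) (illustrative convention)."""
    n = len(x)
    return sum(x[i] * M[i][j] * y[j].conjugate()
               for i in range(n) for j in range(n))


# Every entry equal to sqrt(-1): the naive sum is negative.
z = [1j, 1j, 1j]
naive = naive_square(z)            # equals -3
conjugated = conjugated_square(z)  # equals 3

# Sample data, assumed for illustration only.
hermitian = [[2, 1 - 1j], [1 + 1j, 3]]   # conj(M)^t == M
symmetric = [[1, 1j], [1j, 1]]           # M^t == M, but not Hermitian
x = [1 + 2j, 3 - 1j]
y = [2 - 1j, 1j]

# Conjugate symmetry means (x, y) == conj((y, x)); the gap should be 0.
gap_hermitian = abs(form(x, hermitian, y) - form(y, hermitian, x).conjugate())
gap_symmetric = abs(form(x, symmetric, y) - form(y, symmetric, x).conjugate())

print(naive, conjugated, gap_hermitian, gap_symmetric)
```

With these sample values the gap vanishes for the Hermitian matrix and not for the merely symmetric complex one, which is the behaviour the condition $\overline{M}^t = M$ is meant to capture.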

On

Yes; in physics, observables such as the position or momentum of a particle are modeled with linear operators on a Hilbert space (a vector space over $\mathbb{R}$ or $\mathbb{C}$ with an inner product and other nice properties). In order for this interpretation to make sense, the eigenvalues (which correspond to the possible outcomes of a measurement) must be real-valued, and Hermitian operators are guaranteed to have real eigenvalues.
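To make the real-eigenvalue requirement concrete, here is a small sketch for the $2 \times 2$ case, where the eigenvalues are the roots of $\lambda^2 - (\operatorname{tr} A)\lambda + \det A = 0$. The helper name and the two sample matrices are assumptions chosen for illustration; this is a check on one example, not a proof.

```python
import cmath


def eigenvalues_2x2(m):
    """Roots of the characteristic polynomial of a 2x2 matrix."""
    (a, b), (c, d) = m
    trace = a + d
    det = a * d - b * c
    root = cmath.sqrt(trace * trace - 4 * det)
    return (trace - root) / 2, (trace + root) / 2


# Sample matrices, assumed for illustration only.
hermitian = [[2, 1 - 1j], [1 + 1j, 3]]   # equals its conjugate transpose
symmetric = [[1, 1j], [1j, 1]]           # equals its transpose only

herm_eigs = eigenvalues_2x2(hermitian)
sym_eigs = eigenvalues_2x2(symmetric)

print(herm_eigs)  # both imaginary parts are zero
print(sym_eigs)   # both imaginary parts are nonzero
```

For these inputs the Hermitian matrix has eigenvalues 1 and 4, while the complex symmetric matrix has eigenvalues $1 \pm i$, so symmetry under the plain transpose alone would not give real measurement values.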

Another way to see why this might be the correct definition is to look at the dot product between vectors $v$ and $w$. In $\mathbb{R}^n$, this is given by $$v\cdot w = \sum_{i=1}^nv_iw_i, $$ but in the complex vector space $\mathbb{C}^n$ it is given by $$v\cdot w = \sum_{i=1}^nv_i\overline{w_i}. $$ $A$ being Hermitian is equivalent to the property that $$Av\cdot w = v\cdot Aw$$ for any vectors $v, w$. The $A = \overline{A^T}$ definition guarantees this to be true, and the presence of the complex conjugate therefore comes from the dot product. For general linear operators (not just matrices), we say the operator is self-adjoint or Hermitian (https://en.wikipedia.org/wiki/Self-adjoint_operator).
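The identity $Av\cdot w = v\cdot Aw$ can be spot-checked numerically. The sketch below uses the complex dot product defined above; the helper names, the sample vectors, and the two matrices are hypothetical and only serve as an illustration.

```python
def dot(v, w):
    """Complex dot product: sum of v_i * conj(w_i)."""
    return sum(vi * wi.conjugate() for vi, wi in zip(v, w))


def matvec(A, v):
    """Matrix-vector product with A stored as a list of rows."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def adjoint_gap(A, v, w):
    """|Av . w - v . Aw|, which is zero when A is Hermitian."""
    return abs(dot(matvec(A, v), w) - dot(v, matvec(A, w)))


# Sample data, assumed for illustration only.
hermitian = [[2, 1 - 1j], [1 + 1j, 3]]   # A == conj(A^T)
symmetric = [[1, 1j], [1j, 1]]           # A == A^T, but not Hermitian
v = [1 + 2j, 3 - 1j]
w = [2 - 1j, 1j]

gap_hermitian = adjoint_gap(hermitian, v, w)
gap_symmetric = adjoint_gap(symmetric, v, w)

print(gap_hermitian, gap_symmetric)
```

The gap is zero for the Hermitian matrix and nonzero for the complex symmetric one, which suggests that the conjugate in the definition is what keeps the matrix compatible with the complex dot product.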

The reason we think of them as being like the real numbers is that we can think of the operation $A \mapsto \overline{A^T}$ as a type of involution (an operation that undoes itself when applied twice), just like complex conjugation is an involution on the complex plane. The real numbers are exactly those complex numbers that satisfy $\overline{x} = x$, i.e. they are equal to their image under the involution. Hermitian matrices are exactly those that satisfy the same property for the adjoint operation.
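The analogy can be checked directly: taking the conjugate transpose twice returns the original matrix, and Hermitian matrices are the ones left unchanged by a single application. This is a minimal sketch; the `adjoint` helper and the sample matrices are hypothetical names and values chosen for illustration.

```python
def adjoint(A):
    """Conjugate transpose of a matrix stored as a list of rows."""
    rows, cols = len(A), len(A[0])
    return [[complex(A[i][j]).conjugate() for i in range(rows)]
            for j in range(cols)]


# Sample matrices, assumed for illustration only.
general = [[1 + 2j, 3j], [4, 5 - 1j]]
hermitian = [[2, 1 - 1j], [1 + 1j, 3]]
symmetric = [[1, 1j], [1j, 1]]

# Applying the operation twice gives back the original matrix.
twice_is_identity = adjoint(adjoint(general)) == general

# Hermitian matrices are the fixed points, like reals under conjugation.
hermitian_is_fixed = adjoint(hermitian) == hermitian
symmetric_is_fixed = adjoint(symmetric) == symmetric

print(twice_is_identity, hermitian_is_fixed, symmetric_is_fixed)
```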