Understanding Sylvester's theorems and change of coordinates


(Sylvester's Theorem). Any quadratic form $q$ over $\mathbb{R}$ with matrix $A$ has the form \begin{equation}q({\bf{v}}) = \sum_{i=1}^{t} x_i^2 - \sum_{i=1}^{u} x_{t+i}^2\end{equation} with respect to a suitable basis, where $t + u = \text{rank}(A)$.

Equivalently, given a symmetric matrix $A \in \mathbb{R}^{n \times n}$, there is an invertible matrix $P \in \mathbb{R}^{n \times n}$ such that $P^TAP = D$, where $D = (\alpha_{ij})$ is a diagonal matrix with $\alpha_{ii} = 1$ for $1 \leq i \leq t$, $\alpha_{ii} = -1$ for $t + 1 \leq i \leq t + u$, and $\alpha_{ii} = 0$ for $t + u + 1 \leq i \leq n$, and $t + u = \text{rank}(A)$.

I am trying to understand the above, and why I would want to perform this coordinate change when I can diagonalise instead. I tried an example with $q(x,y) = -x^2 + 6xy - 9y^2$:

$$A = \begin{pmatrix} -1 & 3 \\ 3 & -9 \end{pmatrix}, \lambda_1 = -10, v_1 = \begin{pmatrix}\frac{1}{\sqrt{10}} \\ \frac{-3}{\sqrt{10}}\end{pmatrix}, \lambda_2 = 0, v_2 = \begin{pmatrix}\frac{3}{\sqrt{10}} \\ \frac{1}{\sqrt{10}}\end{pmatrix}$$

$$S = \begin{pmatrix} \frac{1}{\sqrt{10}} & \frac{3}{\sqrt{10}} \\ \frac{-3}{\sqrt{10}} & \frac{1}{\sqrt{10}} \end{pmatrix}, S^TAS = \begin{pmatrix} -10 & 0 \\ 0 & 0 \end{pmatrix} = \Lambda$$
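To check this diagonalisation numerically, here is a quick NumPy sketch using the matrices above:

```python
import numpy as np

# The form q(x, y) = -x^2 + 6xy - 9y^2 as a symmetric matrix.
A = np.array([[-1.0, 3.0],
              [3.0, -9.0]])

# Columns of S are the unit eigenvectors v1, v2 computed above.
s = 1 / np.sqrt(10)
S = np.array([[s, 3 * s],
              [-3 * s, s]])

print(np.round(S.T @ A @ S, 10))  # numerically diag(-10, 0)
```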

Correct me if I am wrong, but from here I believe the usual "nice" change of coordinates, ignoring the theorem above, is defined by:

$$q(x,y) = {\bf{x}}^TA{\bf{x}} = {\bf{x}}^TS\Lambda S^T{\bf{x}} = {\bf{x'}}^T\Lambda{\bf{x'}} = r(x',y'), \quad \text{where } {\bf{x'}} := S^T{\bf{x}}$$

And because $S$ is an orthogonal matrix, this change of coordinates is an isometry (a rotation or reflection), so the geometry of the quadratic form is preserved in the new $(x',y')$ coordinates.

However, for Sylvester's theorem, write (taking the $(2,2)$ entry of $Q$ to be $1$ rather than $0$, so that $Q$ is invertible; this entry only ever multiplies the zero entry of $D$, so $\Lambda = QDQ^T$ is unaffected):

$$Q = \begin{pmatrix} \sqrt{10} & 0 \\ 0 & 1 \end{pmatrix}, D = \begin{pmatrix} -1 & 0 \\ 0 & 0 \end{pmatrix}$$

Factorising $\Lambda$ as $\Lambda = QDQ^T$, equivalently $A = SQD(SQ)^T$, and defining $P := S{(Q^{-1})}^T = SQ^{-1}$ (since $Q$ is diagonal, hence symmetric) verifies that there does indeed exist an invertible (but not orthogonal!) $P$ such that $P^T A P = D$. By the theorem, there must also be a coordinate change under which $q({\bf{v}}) = -x^2$.

The new coordinates $(x'', y'')$ are defined by ${\bf{x}}^TSQD(SQ)^T{\bf{x}} := {\bf{x''}}^TD{\bf{x''}}$, i.e. ${\bf{x''}} = (SQ)^T{\bf{x}}$. I computed the first coordinate as $x'' = x - 3y$ (the second coordinate is irrelevant, since it meets the zero entry of $D$), and multiplying out ${\bf{x''}}^TD{\bf{x''}} = -(x - 3y)^2$ indeed recovers $q(x,y)$.
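I can verify this numerically too; a sketch (taking the $(2,2)$ entry of $Q$ to be $1$ so that $Q^{-1}$ exists — that entry only meets the $0$ in $D$, so $\Lambda = QDQ^T$ still holds):

```python
import numpy as np

A = np.array([[-1.0, 3.0],
              [3.0, -9.0]])
s = 1 / np.sqrt(10)
S = np.array([[s, 3 * s],
              [-3 * s, s]])

# Q must be invertible; its (2,2) entry meets the 0 in D, so set it to 1.
Q = np.diag([np.sqrt(10.0), 1.0])
D = np.diag([-1.0, 0.0])

P = S @ np.linalg.inv(Q)            # P = S Q^{-1}
print(np.round(P.T @ A @ P, 10))    # recovers D = diag(-1, 0)

# The new first coordinate ((S Q).T @ x)[0] equals x - 3y.
x = np.array([2.0, 5.0])
print(((S @ Q).T @ x)[0])           # 2 - 3*5 = -13
```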

But the matrix $SQ$ is not orthogonal: $SQ(SQ)^T = SQQ^T S^T = SQ^2 S^{-1} \neq I$. This must mean the transformation deforms whatever geometric shape $q$ represents. Is there some geometric meaning to this, or an application? Sylvester's Law of Inertia, which I am told is also an important result, tells me that the counts $t$ and $u$ of $\pm 1$ entries are invariant under any transformation of the form described in the box above, which implies there are many such transformations. How else would I find them?
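Numerically, I can at least check this invariance: congruence by any invertible $M$ preserves the counts of positive and negative eigenvalues. A sketch (the helper name `inertia` is my own):

```python
import numpy as np

def inertia(A, tol=1e-8):
    """Return (t, u, z): counts of positive, negative, zero eigenvalues."""
    w = np.linalg.eigvalsh(A)
    return (int((w > tol).sum()), int((w < -tol).sum()),
            int((np.abs(w) <= tol).sum()))

A = np.array([[-1.0, 3.0], [3.0, -9.0]])
print(inertia(A))                 # (0, 1, 1): t = 0, u = 1, rank 1

# Congruence by any invertible M (a fixed example, det M = 1) gives a
# different-looking matrix with the same inertia, by Sylvester's law.
M = np.array([[2.0, 1.0], [1.0, 1.0]])
print(inertia(M.T @ A @ M))       # again (0, 1, 1)
```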

BEST ANSWER

There is no need to begin with orthogonal matrices, which may become difficult to find in dimension 3, 4, or higher.

If the original matrix $H$ has all integer, or all rational, entries, we can find a rational matrix $P$ with $\det P = \pm 1$ such that $P^T HP = D$ is diagonal. The resulting $D$ will obey Sylvester's Law of Inertia. The diagonal entries need not be the eigenvalues.

If the nonzero diagonal entries of $D$ are not all $\pm 1$, you may then apply a further diagonal matrix $R$ whose entries are the reciprocals of the appropriate square roots, so that $R^T P^T HPR$ is of the desired form.
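That rescaling step can be sketched as follows (the helper name is mine; it assumes $D$ is already diagonal):

```python
import numpy as np

def sign_normalize(D, tol=1e-12):
    """Build diagonal R so that R.T @ D @ R has only 0, +1, -1 entries."""
    d = np.diag(D).astype(float)
    r = np.ones_like(d)
    nz = np.abs(d) > tol
    r[nz] = 1.0 / np.sqrt(np.abs(d[nz]))   # reciprocal square roots
    return np.diag(r)

D = np.diag([-10.0, 0.0])       # the Lambda from the question's example
R = sign_normalize(D)
print(R.T @ D @ R)              # numerically diag(-1, 0)
```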

$$ P^T H P = D $$

$$\left( \begin{array}{rr} 1 & 0 \\ 3 & 1 \\ \end{array} \right) \left( \begin{array}{rr} - 1 & 3 \\ 3 & - 9 \\ \end{array} \right) \left( \begin{array}{rr} 1 & 3 \\ 0 & 1 \\ \end{array} \right) = \left( \begin{array}{rr} - 1 & 0 \\ 0 & 0 \\ \end{array} \right) $$

$$ Q^T D Q = H $$

$$\left( \begin{array}{rr} 1 & 0 \\ - 3 & 1 \\ \end{array} \right) \left( \begin{array}{rr} - 1 & 0 \\ 0 & 0 \\ \end{array} \right) \left( \begin{array}{rr} 1 & - 3 \\ 0 & 1 \\ \end{array} \right) = \left( \begin{array}{rr} - 1 & 3 \\ 3 & - 9 \\ \end{array} \right) $$
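Both displayed identities are exact integer computations, so they can be checked directly:

```python
import numpy as np

H = np.array([[-1, 3], [3, -9]])
D = np.array([[-1, 0], [0, 0]])
P = np.array([[1, 3], [0, 1]])    # integer matrix, det P = 1
Q = np.array([[1, -3], [0, 1]])   # note Q = P^{-1}

print(P.T @ H @ P)                # [[-1, 0], [0, 0]]  (= D)
print(Q.T @ D @ Q)                # [[-1, 3], [3, -9]] (= H)
```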


"Why would I perform this coordinate change when I can diagonalize instead?"

Sometimes we don't want to actually perform the coordinate change; we just want to know that such a change exists. As @WillJagy points out, Sylvester's law says not only that such a coordinate change exists, but that the numbers of $+1$ and $-1$ entries (together sometimes called the "signature") are the same for any such coordinate change.

Here's my favorite application:

Take a smooth compact surface without boundary $S$ in $3$-space. Consider the function $$ f_v: S \to \Bbb R : (x, y, z) \mapsto (x, y, z) \cdot v $$ For almost every unit vector $v$, this is a Morse function (all of its critical points are nondegenerate). To simplify, let's suppose that $v = (0,0,1)$ works, so that $f(x, y, z) = z$ is the height function.

Now the second derivative of $f$ at each point $P = (x, y, z) \in S$ is a symmetric bilinear form on the tangent plane to $S$ at $P$. If we look at each critical point $Q$ of $f$ (for a sphere, those would be the south pole and the north pole), the signature of this form tells you whether the surface "bends down" at $Q$ (as at the north pole), "bends up" at $Q$ (as at the south pole), or "bends both ways" (as at the two middle critical points of a bagel balanced vertically on a table). It doesn't matter how much the surface bends up or down, hence I don't care about the eigenvalues; all that matters is that in every direction at the south pole it bends up. And the signature tells me that.

The cool theorem? The sum, over all critical points $Q$, of $(-1)^{\sigma(Q)}$, where $\sigma(Q)$ is the number of $-1$s in the signature at $Q$ (the Morse index), is the same as the Euler characteristic of the surface.
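As a toy illustration of that count, using the Morse indices from the examples above (height function on a sphere, and on a bagel standing upright):

```python
# Morse indices (number of "bending down" directions) at each critical
# point of the height function.
sphere = [0, 2]          # south pole (min), north pole (max)
torus = [0, 1, 1, 2]     # upright bagel: min, two saddles, max

chi = lambda indices: sum((-1) ** i for i in indices)
print(chi(sphere))       # 2, the Euler characteristic of S^2
print(chi(torus))        # 0, the Euler characteristic of T^2
```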

This is all laid out in detail (along with the technical hypotheses required for it to work, like "critical points of $f$ must be isolated") in the first chapter of Milnor's Morse Theory.