Did I just discover a new way to calculate the signature of a matrix?


Due to the complaints about clarity down below, I've split my post into sections. Feel free to skip right to Definitions, Algorithm & Conjecture. If that is still not clear enough, then I'm afraid I can't help it.

Story

I'm taking a course on linear algebra and recently we covered congruence of matrices. By Sylvester's law of inertia, two real symmetric matrices are congruent if and only if they have the same signature. We were advised to calculate signatures by treating matrices as bilinear forms and finding their orthogonal bases, which from my point of view is extremely tedious and requires painstaking work, both in memorization and in computation. So I was looking for a better way to do this, and by Googling I discovered that simultaneous row and column transformations preserve the signature, which turns out to be quite simple to understand once you write elementary operations as matrices: $$ \boldsymbol{A'} = \boldsymbol{EAE}^T $$ That is way easier! However, as an extremely lazy person I still wasn't satisfied, and here's where the fun part begins:

I began looking at the elementary operations and how they affect the outcome. Multiplying a row by a negative constant can trivially change the signature (just consider the identity matrix). After a while I found an example where interchanging rows also changes it. And adding a row multiplied by $-2$ to itself is equivalent to multiplying it by a negative scalar. Thus I was left with adding a different row multiplied by a constant to a row, and I couldn't find a counterexample for this one. More than that! Using only this operation I got through my previous assignment, and by turning a matrix into row echelon form I was able to get the correct signature in every exercise. It also helped me spot a mistake in my simultaneous row and column operations on a recent test. By this method I calculated 11 correct signatures - it would be very odd if this were just an accident!

I know that the chances of me discovering something new in math are infinitesimal, but I couldn't resist the clickbaity title - I hope you'll forgive me. I am genuinely curious about this one, though. I tried talking with my professor about it, but he seemed uninterested, or maybe I did a poor job explaining it. He dismissed the entire problem by saying that reduction to row echelon form does not preserve the signature.

Did I stumble upon some already known algorithm? If so, why would no one talk about it at uni? I tried thinking about how to prove this, but nothing comes to mind. Perhaps I'm missing some obvious counterexample? If so, why did it work in all of the previous exercises?

Definitions

We use the following definitions of congruence and signature.

Congruence: We say that two square matrices $A$ and $B$ over some field are congruent if there exists an invertible matrix $P$ such that: $$\boldsymbol{A} = \boldsymbol{P}^T \boldsymbol{B P}$$

Signature: A real, nondegenerate symmetric $n \times n$ matrix $A$, with its corresponding symmetric bilinear form $\boldsymbol{G}(v,u) = v^T \boldsymbol{A} u$, has signature $(p,q)$ (or $p-q$ in a different notation) if there is a nondegenerate matrix $C$ such that $\boldsymbol{CAC}^T$ is a diagonal matrix with $p$ entries equal to $1$ and $q$ entries equal to $-1$.

Algorithm

  1. Using only this operation - adding a row multiplied by a constant to another row - bring the matrix to its upper-triangular form.
  2. Let $p$ be the number of positive entries on the diagonal and $q$ the number of negative ones. The signature of the matrix is $(p,q)$.
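For concreteness, here is a minimal Python sketch of the two steps above (the function name and the use of exact `Fraction` arithmetic are my own choices, not part of the original post):

```python
from fractions import Fraction

def signature_by_row_reduction(a):
    """Reduce a symmetric matrix to upper-triangular form using ONLY the
    operation r_i -> r_i + c * r_j (no swaps, no scaling), then count the
    signs of the diagonal entries."""
    m = [[Fraction(x) for x in row] for row in a]  # exact arithmetic
    n = len(m)
    for j in range(n):
        for i in range(j + 1, n):
            if m[i][j] == 0:
                continue  # nothing to eliminate in this position
            if m[j][j] == 0:
                raise ValueError("zero pivot: reduction without row swaps is stuck")
            c = m[i][j] / m[j][j]
            m[i] = [m[i][k] - c * m[j][k] for k in range(n)]
    diag = [m[k][k] for k in range(n)]
    return (sum(1 for d in diag if d > 0),   # p
            sum(1 for d in diag if d < 0))   # q

# First example from the question: the diagonal ends up (8, -8, 0)
print(signature_by_row_reduction([[8, 8, 5], [8, 0, 4], [5, 4, 3]]))  # (1, 1)
```

Note that this sketch simply raises an error when a zero pivot blocks the reduction, which is exactly the situation the accepted answer discusses.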

Conjecture

The aforementioned Algorithm provides a correct signature for all nondegenerate symmetric square matrices.

Further questions

Why would that be? How would one prove it? Any ideas for counterexamples? Does it hold for $3 \times 3$ matrices and smaller but fail for bigger matrices, as suggested by Ben Grossmann in the comments? Any counterexamples of this sort? In that case, why would it work for $n = 3$?

Examples

$$ \boldsymbol{A}= \begin{bmatrix} 8 & 8 & 5\\ 8 & 0 & 4\\ 5 & 4 & 3 \end{bmatrix} \overset{r_2 \to r_2-r_1}{\longrightarrow} \begin{bmatrix} 8 & 8 & 5\\ 0 & -8 & -1\\ 5 & 4 & 3 \end{bmatrix} \overset{r_3 \to r_3-\frac{5}{8}r_1}{\longrightarrow} \begin{bmatrix} 8 & 8 & 5\\ 0 & -8 & -1\\ 0 & -1 & -\frac{1}{8} \end{bmatrix} \overset{r_3 \to r_3-\frac{1}{8}r_2}{\longrightarrow} \begin{bmatrix} 8 & 8 & 5\\ 0 & -8 & -1\\ 0 & 0 & 0 \end{bmatrix} $$

And we already see that the signature is (1,1).

Let $x \in \mathbb{R}$. For which values of $x$ does the signature of $B$ equal $2$?

$$ \boldsymbol{B} = \begin{bmatrix} 1 & 0 & 1\\ 0 & 2 & 3\\ 1 & 3 & x \end{bmatrix} \overset{r_3 \to r_3-r_1}{\longrightarrow} \begin{bmatrix} 1 & 0 & 1\\ 0 & 2 & 3\\ 0 & 3 & x-1 \end{bmatrix} \overset{r_3 \to r_3-\frac{3}{2}r_2}{\longrightarrow} \begin{bmatrix} 1 & 0 & 1\\ 0 & 2 & 3\\ 0 & 0 & x-\frac{11}{2} \end{bmatrix} $$

And the answer: for $x = \frac{11}{2}$ the signature is $(2,0) = 2$.

Let $t, s \in \mathbb{R}$.

$$ \boldsymbol{C}= \begin{bmatrix} 0 & 0 & 0 & 0 & t^2\\ 0 & -1 & 0 & 1 & 0\\ 0 & 0 & 1 & s & 0\\ 0 & 1 & s & s^2-1 & 0\\ t^2 & 0 & 0 & 0 & 0\\ \end{bmatrix} \underset{r_5 \to r_5-r_1}{\overset{r_1 \to r_1+r_5}{\longrightarrow}} \begin{bmatrix} t^2 & 0 & 0 & 0 & t^2\\ 0 & -1 & 0 & 1 & 0\\ 0 & 0 & 1 & s & 0\\ 0 & 1 & s & s^2-1 & 0\\ 0 & 0 & 0 & 0 & -t^2\\ \end{bmatrix} \underset{r_4 \to r_4-sr_3}{\overset{r_4 \to r_4+r_2}{\longrightarrow}} \begin{bmatrix} t^2 & 0 & 0 & 0 & t^2\\ 0 & -1 & 0 & 1 & 0\\ 0 & 0 & 1 & s & 0\\ 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & -t^2\\ \end{bmatrix} $$

Signature is $(2,2)$ for $t \not= 0$ and $(1,1)$ for $t = 0$. I got the same result with simultaneous row and column operations; it took twice as long.

Accepted Answer

Very nice question and observation! The fact that your method works for the examples provided is not a coincidence and is indeed quite general.

First, let me give you an example which shows that your conjecture cannot hold as stated. Consider the following sequence of operations:

$$ \underbrace{\begin{bmatrix} -1 & 0 \\ 0 & -1 \end{bmatrix}}_{A_1} \xrightarrow{R_1 = R_1 - R_2} \underbrace{\begin{bmatrix} -1 & 1 \\ 0 & -1 \end{bmatrix}}_{A_2} \xrightarrow{R_2 = R_2 + 2R_1} \underbrace{\begin{bmatrix} -1 & 1 \\ -2 & 1 \end{bmatrix}}_{A_3} \xrightarrow{R_1 = R_1 - R_2} \underbrace{\begin{bmatrix} 1 & 0 \\ -2 & 1 \end{bmatrix}}_{A_4} \xrightarrow{R_2 = R_2 + 2R_1} \underbrace{\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}}_{A_5}. $$

The matrix $A_1$ is symmetric with signature $(0,2)$. The matrix $A_2$ is neither symmetric nor diagonalizable but still has $-1$ as a double root of the characteristic polynomial, so you might say it still has the "correct signature". The matrix $A_3$, however, has eigenvalues $\pm i$. The matrix $A_4$ is not diagonalizable but now has $1$ as a double root of the characteristic polynomial. Finally, the matrix $A_5$ is symmetric with signature $(2,0)$.
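The sequence can be replayed mechanically; here is a short Python sketch (the `row_add` helper is my own) applying exactly these four operations:

```python
def row_add(m, i, j, c):
    """r_i -> r_i + c * r_j: the only operation the conjecture allows."""
    m[i] = [x + c * y for x, y in zip(m[i], m[j])]

a = [[-1, 0], [0, -1]]   # A_1: symmetric, signature (0, 2)
row_add(a, 0, 1, -1)     # R_1 = R_1 - R_2   -> A_2
row_add(a, 1, 0, 2)      # R_2 = R_2 + 2 R_1 -> A_3
row_add(a, 0, 1, -1)     # R_1 = R_1 - R_2   -> A_4
row_add(a, 1, 0, 2)      # R_2 = R_2 + 2 R_1 -> A_5
print(a)  # [[1, 0], [0, 1]]: upper triangular with diagonal signs (2, 0),
          # while the true signature of A_1 is (0, 2)
```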

You might object that this sequence of operations is "silly", but it shows that if you start with an arbitrary symmetric matrix and perform arbitrary row-addition operations until you reach some upper triangular matrix, the signature can't always be read off the diagonal entries.

So why did you get the correct answers? Let's first consider what happened in your first and second examples and try to formalize it. You started with a symmetric matrix $A$ and, using only the operation of adding a multiple of one row to another, brought $A$ into an upper triangular matrix $U$. More precisely, you did not use a completely arbitrary sequence of row operations but performed them in the "standard order" which usually appears in Gaussian elimination. For the $3 \times 3$ examples, you first used $R_2 = R_2 + cR_1, R_3 = R_3 + c'R_1$ to eliminate all the elements below the diagonal in the first column. Then you used $R_3 = R_3 + c''R_2$ to eliminate all the elements below the diagonal in the second column. Each row operation corresponds to multiplication of $A$ by an elementary matrix $P$. So you have performed a sequence of operations

$$ A \rightarrow P_1 A \rightarrow P_2 P_1 A \rightarrow \dots \rightarrow \underbrace{ \left( P_k \cdots P_1 \right)}_{L} A=U. $$

Since you performed your operations in the "standard order", the matrix $L$ which encodes all the operations is lower triangular with $1$'s on the diagonal. Let's multiply the equation $LA = U$ by $L^T$ on the right to get $LAL^T = UL^T$. Now here comes the magic. Both $U$ and $L^T$ are upper triangular, where $L^T$ has $1$'s on the diagonal, so $UL^T$ is upper triangular. But the equation $LAL^T = UL^T$ shows that $UL^T$ is also symmetric, and so it must be diagonal! Hence $A$ is congruent to a diagonal matrix whose entries are the diagonal entries of $U$.
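This can be checked directly; here is a small Python sketch (the helper and variable names are mine) that accumulates $L$ for the question's first example and verifies that $LAL^T$ comes out diagonal:

```python
from fractions import Fraction

def matmul(x, y):
    """Naive square-matrix product, enough for this check."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

A = [[Fraction(v) for v in row] for row in
     [[8, 8, 5], [8, 0, 4], [5, 4, 3]]]   # first example from the question
n = len(A)

# Standard-order elimination: apply each row operation to U and, in
# parallel, to the identity, so L = P_k ... P_1 accumulates as a unit
# lower triangular matrix satisfying L A = U.
L = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
U = [row[:] for row in A]
for j in range(n - 1):
    for i in range(j + 1, n):
        c = U[i][j] / U[j][j]
        U[i] = [U[i][k] - c * U[j][k] for k in range(n)]
        L[i] = [L[i][k] - c * L[j][k] for k in range(n)]

Lt = [list(r) for r in zip(*L)]   # L^T
D = matmul(matmul(L, A), Lt)      # = U L^T: symmetric and triangular,
                                  # hence the diagonal matrix diag(8, -8, 0)
```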

This explains why you got the correct answer in the first two examples. Note that it doesn't matter whether $A$ is nondegenerate or not (both of your examples are degenerate or possibly degenerate). However, this is not a general algorithm, because if $a_{11} = 0$ you obviously can't use that entry as a pivot to cancel the entries below it. In Gaussian elimination you would pivot by using a row swap to make this element nonzero; however, a one-sided swap can change the signature of the matrix. In your case you can use a simultaneous row-column operation to make $a_{11}$ nonzero and then proceed with your algorithm, but then it might happen that $a_{22}$ is zero, and you are "stuck" because you only performed "half the operations", so there is no reason the "signature" should be preserved.
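A tiny illustration of the obstruction (this $2 \times 2$ example is my own, not from the answer):

```python
a = [[0, 1], [1, 0]]      # symmetric, eigenvalues +1 and -1: signature (1, 1)
# Row-addition-only elimination is stuck here: a[0][0] == 0 with a nonzero
# entry below it. A one-sided row swap r1 <-> r2 unblocks the pivot...
a[0], a[1] = a[1], a[0]   # a is now [[1, 0], [0, 1]]
# ...but the resulting upper-triangular matrix has diagonal signs (2, 0),
# which is NOT the signature (1, 1): a one-sided swap is not a congruence.
```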

The third example is more interesting, because this is precisely the case where you need pivoting. If you keep track of the elementary operations you used, you will see that $P_k \cdots P_1$ is not lower triangular anymore (even though it is "almost" lower triangular), so my argument fails, but you still got the correct answer. I'm not sure whether this is coincidental and will try to analyze it more carefully when I have the time.