What is the name of the theorem of tridiagonal reduction of symmetric matrices?


For the past year, as part of my PhD studies, I have been implementing a code to solve the many-body Schrödinger equation for applications in nuclear physics. The most important step in that code is the use of the Lanczos algorithm to find the lowest eigenvalues of the Hamiltonian matrix. Like most numerical algorithms for diagonalizing symmetric matrices, Lanczos reduces the matrix to a tridiagonal matrix with the same eigenvalues, which can then be diagonalized easily.

In a summer course I took back in 2017, a lecturer cited a specific theorem stating that any symmetric (diagonalizable) matrix can be reduced to a tridiagonal matrix. Unfortunately, I did not write it down. Now that I am writing my dissertation, I would like to cite that theorem when I explain the Lanczos algorithm. I have looked for it in every source I can think of, but I cannot find any mention of such a theorem; every text on numerical symmetric matrix diagonalization seems to take that property of symmetric matrices as obvious and does not cite a theorem. Does anyone here know the name of that theorem, or a source where it is proved?



No theorem needs to be invoked in order to perform the similarity transformation in the Lanczos algorithm. It is correct by construction.

In the comments, you cited this link as a reference. In section 2.1, $α_i$, $β_i$, and $v_i$ are defined by Eq. (1); $T$ is defined as the tridiagonal matrix built from the $α_i$ and $β_i$, and $V$ is the matrix whose columns are the $v_i$. With $T$ and $V$ defined this way, rewriting Eq. (1) as a single matrix equation gives Eq. (3), and Eq. (3) is an orthogonal similarity transformation.

This is why the algorithm is so brilliant. If you pick the construction correctly, you automatically get a similarity transformation.

If you still have questions about the algorithm, then the misunderstanding involves one of the steps above, probably the step where you re-write Eq. (1) as a matrix equation.
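To make the construction concrete, here is a minimal Python sketch of the three-term recurrence (my own code and naming, not taken from the linked notes): it builds the columns of $V$ one at a time and assembles $T$ from the $α_i$ and $β_i$, so that $V^\top A V = T$ holds by construction, up to roundoff and only while the $v_i$ remain orthogonal.

```python
import numpy as np

def lanczos(A, v0, m):
    """Plain Lanczos: m steps of the three-term recurrence for symmetric A."""
    n = A.shape[0]
    V = np.zeros((n, m))        # columns are the Lanczos vectors v_i
    alpha = np.zeros(m)         # diagonal entries of T
    beta = np.zeros(m - 1)      # off-diagonal entries of T
    V[:, 0] = v0 / np.linalg.norm(v0)
    for j in range(m):
        w = A @ V[:, j]
        alpha[j] = V[:, j] @ w
        w -= alpha[j] * V[:, j]             # remove component along v_j
        if j > 0:
            w -= beta[j - 1] * V[:, j - 1]  # remove component along v_{j-1}
        if j < m - 1:
            beta[j] = np.linalg.norm(w)
            V[:, j + 1] = w / beta[j]       # next Lanczos vector
    # T is tridiagonal with alpha on the diagonal and beta off it
    T = np.diag(alpha) + np.diag(beta, 1) + np.diag(beta, -1)
    return T, V
```

Running all $m = n$ steps gives, in exact arithmetic, the full orthogonal similarity transformation of Eq. (3); stopping earlier gives the rectangular relation in which $A V$ and $V T$ agree except for a residual in the last column.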


As an aside, I want to point out that there are important things left to be proved; it's just that they come after Eq. (3). There are at least three things that pop out at me:

  1. Typically we halt the iteration well before reaching $n$ and argue that $T$'s extreme eigenvalues and eigenvectors approximate $A$'s. This needs proof, and I have never read a full proof of it, but I have heard that it is straightforward for Hermitian matrices and hard for non-Hermitian matrices (the Arnoldi iteration).

  2. How do we know that none of the $v_i$ are zero vectors? (If any of them were zero, then $V$ would not be an orthogonal matrix. We do, however, know that they are all orthogonal by construction.)

  3. How do we know that the algorithm will be numerically stable in practice? (The naive approach isn’t.)
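On point 3, one standard remedy in finite precision is to explicitly re-orthogonalize each new vector against all previous ones. A hypothetical sketch (my own code and naming, not a canonical implementation; the two-pass re-orthogonalization follows the common "twice is enough" heuristic):

```python
import numpy as np

def lanczos_full_reorth(A, v0, m):
    """Lanczos with full reorthogonalization for symmetric A (sketch)."""
    n = A.shape[0]
    V = np.zeros((n, m))
    alpha = np.zeros(m)
    beta = np.zeros(m - 1)
    V[:, 0] = v0 / np.linalg.norm(v0)
    for j in range(m):
        w = A @ V[:, j]
        alpha[j] = V[:, j] @ w
        w -= alpha[j] * V[:, j]
        if j > 0:
            w -= beta[j - 1] * V[:, j - 1]
        # In exact arithmetic w is already orthogonal to all earlier v_i;
        # in floating point it drifts, so project those components out again.
        for _ in range(2):  # "twice is enough" heuristic
            w -= V[:, :j + 1] @ (V[:, :j + 1].T @ w)
        if j < m - 1:
            beta[j] = np.linalg.norm(w)  # a (near-)zero here signals breakdown
            V[:, j + 1] = w / beta[j]
    T = np.diag(alpha) + np.diag(beta, 1) + np.diag(beta, -1)
    return T, V
```

This costs $O(nm)$ extra work per step, which is why practical codes often use cheaper selective or partial reorthogonalization instead; the plain recurrence without this fix is the "naive approach" that is not numerically stable.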