Fast Gauss-Seidel convergence on low rank matrices

378 Views Asked by At

I stumbled upon the following remarkable fact when experimenting with the Gauss-Seidel iterative solver: First I construct a low-rank symmetric positive semi-definite matrix $A = M^TM$ with M a random (randn) matrix of size (k x n) with k << n (e.g. k = 100, n = 1000). Secondly I estimate the Gauss-Seidel convergence rate as follows:

  • Let $M = D + L$ and $K = U$ ($A = D + L + U$)

  • Consider the eigenvalues $\lambda$ of $M^{-1}K$. All will be $<= 1$ in absolute value (Householder-John theorem that proves GS convergence on SPD matrices). The eigenvalues of value $1$ correspond to the kernel of A, and can I ignore these (assuming I am not interested in the minimum norm solution, but any solution. Inspired by https://mathoverflow.net/questions/80793/is-gauss-seidel-guaranteed-to-converge-on-semi-positive-definite-matrices)

  • So finally let's call the convergence rate the largest eigenvalue below 1. For large matrices with k near n, this is typically very close to 1, which explains the slow convergence of GS. However, when k << n, I find that the convergence rate is <<1, more like 0.01, thus yielding very fast GS convergence.

Execute the code below in matlab to see for yourself

clear;

n = 1000
k = 100

C = randn(k, n);
A = C'*C;

L = tril(A, -1); U = triu(A, 1); D = diag(diag(A)); %A = D + L + U

M = D + L;
K = U;
R = M\K;
e = abs(eig(R));
rho = max(e( abs(1-e) > 1e-10))

Does anyone have an idea why this occurs?

1

There are 1 best solutions below

1
On

So I believe I'm progressing towards the answer.

  • A first observation I find is that for k << n, the condition number of M (and thus A) is very low, even for big matrices. For k == n, the condition number seems bad for big matrices. For formal theory one would have to revert to Random Matrix Theory but I'm far from an expert on that.

  • Secondly, there seems to be a correlation between the condition number of M and A, and the spectral radius of $-K^{-1}M$. Ill-conditioning means slower convergence. I have found independent empirical evidence on this (http://fedcsis.eucip.pl/proceedings/pliks/65.pdf) but that paper is quite poor. Formal theory I do not know of. I'm going to try to relate the eigenvalues of A to those of $-K^{-1}M$.

Feel free to complete and correct!