I stumbled upon the following remarkable fact when experimenting with the Gauss-Seidel iterative solver: First I construct a low-rank symmetric positive semi-definite matrix $A = M^TM$ with M a random (randn) matrix of size (k x n) with k << n (e.g. k = 100, n = 1000). Secondly I estimate the Gauss-Seidel convergence rate as follows:
Let $M = D + L$ and $K = U$ ($A = D + L + U$)
Consider the eigenvalues $\lambda$ of $M^{-1}K$. All will be $<= 1$ in absolute value (Householder-John theorem that proves GS convergence on SPD matrices). The eigenvalues of value $1$ correspond to the kernel of A, and can I ignore these (assuming I am not interested in the minimum norm solution, but any solution. Inspired by https://mathoverflow.net/questions/80793/is-gauss-seidel-guaranteed-to-converge-on-semi-positive-definite-matrices)
So finally let's call the convergence rate the largest eigenvalue below 1. For large matrices with k near n, this is typically very close to 1, which explains the slow convergence of GS. However, when k << n, I find that the convergence rate is <<1, more like 0.01, thus yielding very fast GS convergence.
Execute the code below in matlab to see for yourself
clear;
n = 1000
k = 100
C = randn(k, n);
A = C'*C;
L = tril(A, -1); U = triu(A, 1); D = diag(diag(A)); %A = D + L + U
M = D + L;
K = U;
R = M\K;
e = abs(eig(R));
rho = max(e( abs(1-e) > 1e-10))
Does anyone have an idea why this occurs?
So I believe I'm progressing towards the answer.
A first observation I find is that for k << n, the condition number of M (and thus A) is very low, even for big matrices. For k == n, the condition number seems bad for big matrices. For formal theory one would have to revert to Random Matrix Theory but I'm far from an expert on that.
Secondly, there seems to be a correlation between the condition number of M and A, and the spectral radius of $-K^{-1}M$. Ill-conditioning means slower convergence. I have found independent empirical evidence on this (http://fedcsis.eucip.pl/proceedings/pliks/65.pdf) but that paper is quite poor. Formal theory I do not know of. I'm going to try to relate the eigenvalues of A to those of $-K^{-1}M$.
Feel free to complete and correct!