The MINRES algorithm for solving $Ax = b$ for symmetric $A$ can be described as follows: the $k$-th iterate of the algorithm is $$x_k = \arg\min_{x\in K_k(A)} \lVert Ax-b \rVert_2,$$ where $K_k(A)=\text{span}\{A^ib\mid i < k\}$ is the $k$-th Krylov subspace of $A$ and $b$.
By this definition, it is clear that the method converges to the exact solution in at most $n$ iterations if $A$ is an $n\times n$ matrix. When using this solver on matrices with a very specific structure, I noticed that it converges in just 3 iterations. The matrices are of the form $$ A = \begin{bmatrix} I_{n-1} & v\\ v^T & 0 \end{bmatrix} $$ where $v$ is a column vector and $I_{n-1}$ is the $(n-1) \times (n-1)$ identity matrix. How can this early convergence be explained?
My thoughts
The matrix $A$ can in this case be seen as the identity matrix plus two rank-one corrections: $$A = I_n + \begin{bmatrix} 0 & v\\ 0 & -\frac{1}{2} \end{bmatrix} + \begin{bmatrix} 0 & 0\\ v^T & -\frac{1}{2} \end{bmatrix}$$
This means that we can invert the matrix exactly by applying the Sherman-Morrison formula twice. I currently avoid doing this explicitly because it leads to instabilities, whereas 3 MINRES iterations do not. Maybe the MINRES algorithm implicitly exploits the fact that $A$ is just a rank-two correction away from the identity?
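For concreteness, one way to carry out the two Sherman-Morrison steps (writing $u = \begin{bmatrix} v \\ -\frac{1}{2} \end{bmatrix}$ and $e_n$ for the last standard basis vector, so that the two corrections are $u e_n^T$ and $e_n u^T$): since $e_n^T u = -\frac{1}{2}$, the first step gives $$ B^{-1} := (I_n + u e_n^T)^{-1} = I_n - \frac{u e_n^T}{1 + e_n^T u} = I_n - 2\, u e_n^T, $$ and the second step gives $$ A^{-1} = (B + e_n u^T)^{-1} = B^{-1} - \frac{B^{-1} e_n\, u^T B^{-1}}{1 + u^T B^{-1} e_n}, \qquad 1 + u^T B^{-1} e_n = -2\, v^T v. $$ The division by $-2\,v^Tv$ in the last step may be one source of the instabilities mentioned above when $\lVert v\rVert$ is small.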
You can verify this behaviour with a short Python snippet such as the sketch below.
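A minimal sketch, assuming SciPy's `scipy.sparse.linalg.minres` and counting iterations via its `callback`; the dimension $n$ and the random data are arbitrary illustrative choices:

```python
import numpy as np
from scipy.sparse.linalg import minres

rng = np.random.default_rng(0)
n = 200
v = rng.standard_normal(n - 1)

# Assemble A = [[I, v], [v^T, 0]].
A = np.eye(n)
A[-1, -1] = 0.0
A[:-1, -1] = v
A[-1, :-1] = v

b = rng.standard_normal(n)

# Count MINRES iterations through the callback.
iterations = []
x, info = minres(A, b, callback=lambda xk: iterations.append(xk.copy()))

print("iterations:", len(iterations))          # typically 3
print("residual:  ", np.linalg.norm(A @ x - b))
```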
The reason is that there is a large eigenspace corresponding to the eigenvalue $1$. If you solve the eigenvalue problem
$$ \begin{bmatrix} I_{n-1} & v \\ v^T & 0\end{bmatrix} \cdot \begin{bmatrix} x \\ \alpha \end{bmatrix} = \lambda \begin{bmatrix} x \\ \alpha \end{bmatrix} $$
you'll find that there are $n-2$ eigenvectors with $\alpha=0$ and $x\perp v$ corresponding to the eigenvalue $1$, and 2 eigenvectors with $\alpha\neq 0$ and $x\parallel v$. More specifically, in the latter case we have the eigenvectors $x_\pm=v$, $\alpha_\pm = \frac{1}{2}\left(-1\pm\sqrt{1+4 v^T v}\right)$ corresponding to the eigenvalues $\lambda_\pm = 1+\alpha_\pm = \frac{1\pm\sqrt{1+4v^T v}}{2}$ (substituting $x=v$ into the first block row gives $\lambda = 1+\alpha$, and the last row $v^Tx=\lambda\alpha$ then yields $\alpha^2+\alpha-v^Tv=0$). Note that since $A$ is symmetric, these eigenvectors are orthogonal.
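These formulas are easy to check numerically; a small sketch assuming NumPy, with an arbitrary dimension and random $v$:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10
v = rng.standard_normal(n - 1)

# A = [[I, v], [v^T, 0]]
A = np.eye(n)
A[-1, -1] = 0.0
A[:-1, -1] = v
A[-1, :-1] = v

vtv = v @ v
lam_plus = (1 + np.sqrt(1 + 4 * vtv)) / 2
lam_minus = (1 - np.sqrt(1 + 4 * vtv)) / 2

# Spectrum: n-2 eigenvalues equal to 1, plus lambda_+ and lambda_-.
print(np.sort(np.linalg.eigvalsh(A)))
print("lambda_-:", lam_minus, "lambda_+:", lam_plus)
```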
Consequently, given a target vector $y=\begin{bmatrix}b\mid\beta\end{bmatrix}$, we can split it up as a linear combination
$$ y = \underbrace{\mu_1\begin{bmatrix}v\mid\alpha_+\end{bmatrix}}_{=: z_1} + \underbrace{\mu_2\begin{bmatrix}v\mid \alpha_-\end{bmatrix}}_{=: z_2} + \underbrace{\mu_3\begin{bmatrix}v^\perp\mid 0\end{bmatrix}}_{=: r} $$
Then $ A y = \lambda_+ z_1 + \lambda_- z_2 + r $, $A^2y =\lambda_+^2 z_1 + \lambda_-^2 z_2 + r$, etc. Now it should become clear why we only need 3 iterations: every Krylov vector $A^k y$ lies in the 3-dimensional, $A$-invariant subspace $\text{span}\{z_1, z_2, r\}$, which also contains the exact solution, so we only need to determine 3 coefficients: those of $z_1$, $z_2$ and $r$.
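As a sanity check, one can verify numerically that the Krylov vectors never leave this 3-dimensional subspace; a sketch with arbitrary random data, assuming NumPy:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50
v = rng.standard_normal(n - 1)

# A = [[I, v], [v^T, 0]] as above.
A = np.eye(n)
A[-1, -1] = 0.0
A[:-1, -1] = v
A[-1, :-1] = v

y = rng.standard_normal(n)

# The Krylov matrix [y, Ay, A^2 y, A^3 y] has rank 3:
# every A^k y stays in span{z_1, z_2, r}.
K = np.column_stack([np.linalg.matrix_power(A, k) @ y for k in range(4)])
K /= np.linalg.norm(K, axis=0)             # normalize columns for a clean rank test
print(np.linalg.matrix_rank(K, tol=1e-8))  # expected: 3
```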
Lemma: Assume $A$ is orthogonally similar to the $n\times n$ block matrix $A' = \begin{bmatrix}I_r & 0 \\0 & D\end{bmatrix}$, where $D$ is a diagonal matrix whose diagonal entries are pairwise distinct and all different from $1$. Then MINRES terminates in exactly $n-r+1$ iterations.
Proof: Let $U$ be an orthogonal matrix with $A' = UAU^T$, and write $x' = Ux$, $y' = Uy$. Since multiplication by the orthogonal $U$ preserves the 2-norm, we have $$ \|Ax - y\|_2 =\| A U^T Ux - y\|_2 = \|UA U^T U x - U y\|_2 = \|A' x' - y'\|_2. $$ Together with $K_k(A') = U\,K_k(A)$ (the Krylov subspaces being built from $y'$ and $y$, respectively), this shows that both optimization problems are equivalent, in the sense that if $x_k' = \arg\min_{\xi\in K_k(A')}\|A'\xi-y'\|$ then $x_k = U^Tx_k'$ solves $\arg\min_{\xi\in K_k(A)}\|A\xi-y\|$. The early termination in the diagonal case then follows by induction over the size of $D$.
Corollary: If $A$ is orthogonally diagonalizable and has exactly $r$ distinct eigenvalues, then MINRES terminates in exactly $r$ iterations. (In practice this is not guaranteed due to numerical error.)
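A quick numerical illustration of the corollary; a sketch assuming `scipy.sparse.linalg.minres`, where the choice of $r=5$ eigenvalues and the random orthogonal basis are arbitrary:

```python
import numpy as np
from scipy.sparse.linalg import minres

rng = np.random.default_rng(1)
n = 100
values = np.array([-3.0, 0.5, 1.0, 2.0, 7.0])   # r = 5 distinct eigenvalues

# Random orthogonal basis via QR, then A = Q diag(eigs) Q^T.
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
eigs = np.concatenate([values, values[rng.integers(len(values), size=n - len(values))]])
A = (Q * eigs) @ Q.T
A = (A + A.T) / 2                               # symmetrize against rounding

b = rng.standard_normal(n)
iterations = []
x, info = minres(A, b, callback=lambda xk: iterations.append(1))

print("iterations:", len(iterations))          # typically r = 5
print("residual:  ", np.linalg.norm(A @ x - b))
```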