I work in the field and FEM and CFD and usually come across such a sparse saddle-point system when dealing with Dirichlet boundary conditions.
$\begin{bmatrix} K & S \\ S^T & 0 \end{bmatrix} \begin{bmatrix} x \\ \lambda \end{bmatrix} = \begin{bmatrix} f \\ d \end{bmatrix}$
Here $x\in R^n$ and $\lambda \in R^m$, with $m < n$. $S_{n\times m}$ is of full rank of $m$ and $K_{n\times n}$ is symmetric and non negative. When $K$ is of full rank we could use Uzawa's method and its variations.
However, sometimes I have a $K_{n \times n}$ not of full rank but its nullity less than $m$, so the whole system is still of full rank (when $K$ is the stiffness matrix and $S$ denotes linear constraints). In this situation what alternatives do we have?
In my particular case I have $m\ll n$ and I don't care about $\lambda$, the method I am using is described below:
We can assume $x = x_0 + z$ and can always find a $x_0$ such that $d-S^Tx_0=0$ (such as the minimal norm solution $S(S^TS)^{-1}d$), then we change the system to the following
$\begin{bmatrix} K & S \\ S^T & 0 \end{bmatrix} \begin{bmatrix} z \\ \lambda \end{bmatrix} = \begin{bmatrix} f - Kx_0\\ d - S^Tx_0 \end{bmatrix} = \begin{bmatrix} c\\ 0 \end{bmatrix} $
We can define subspace $W=column(S)$ and its orthogonal complement $W^{\perp}$, the linear problem equivalently becomes finding $z\in W^{\perp}$ so that $ c-Kz = r\in W$. Then I can define the orthogonal projection along $W$ onto $W^{\perp}$, which is $P = I - S(S^TS)^{-1}S^T$, and the problems becomes finding $z\in W^{\perp}$ to solve $r = P(c-Kz)=0$. I know $PK$ is rank-deficient but Conjuate Gradient just works fine because every vectors involved ($z$,$r$,$p$,$PKp$) always stay within $W^{\perp}$. I tested this method on many systems with dimension of $K$ up to 10 million and it always converges quickly to a unique solution.
As for finding $x_0$, we can start with an arbitrary vector $y\in R^n$ and say finding a vector $x_0$ among all vectors satisfying $S^Tx-d=0$ that minimizes $||y-x||_2$. Simply applying Lagrange multiplier method we have $x_0=Py+S(S^TS)^{-1}d$.
So do you think my method is mathematically sound? It seems much simpler than Uzawa's method or its variations and can handle rank-deficient $K$. Did I come up something useful or people have being using this method for long time but I didn't know? Obviously the drawback of the method is when $m$ becomes large it is expensive to compute $(S^TS)^{-1}$. In my particular problems most columns of $S$ are orthogonal to each other so it is not a issue for me.