Suitable matrix decomposition for incorporating diagonal offsets


The problem: I need to repeatedly calculate:

$$\left( {\bf A} + {\rm diag} \left( {\bf b} \right) \right)^{-1} {\bf c}$$

for a fixed, square, positive definite ${\bf A} \succ {\bf 0}$ and varying ${\bf b} \geq {\bf 0}$. Here ${\bf A} \in \mathbb{R}^{n \times n}$ with $n \approx 1000$ or so (roughly, depending on the problem), and the calculation is performed very often, so it needs to be quick. I would like to factorise ${\bf A}$ upfront (complexity $\mathcal{O} (n^3)$, but one-off), then update the factorisation for each ${\bf b}$ and solve in $\mathcal{O} (n^2)$ or similar (repeated lower-cost computation).

What I've tried: Updating the Cholesky factorisation (my usual go-to) with a diagonal offset requires $\mathcal{O} (n^3)$ computations (it amounts to $n$ rank-one updates at $\mathcal{O} (n^2)$ each), as does updating the inverse, the eigendecomposition (afaict) and the SVD (afaict), so those aren't feasible. Other matrix factorisations I'm familiar with don't help either (they are $\mathcal{O} (n^3)$ or worse, where I can figure out how to do the operation at all). I did try a Neumann-series-type approximation, which works sometimes but not always (unsurprising, given the lack of guarantees on ${\bf A}$ and ${\bf b}$), and even when it does converge I have doubts about both its computational complexity and its accuracy.

The question: are there any matrix factorisations that let you do diagonal updates ${\bf A} \to {\bf A} + {\rm diag} \left( {\bf b} \right)$ and inversion at less than $\mathcal{O} (n^3)$ computational cost?

fwiw I'm fine with an approximation (so long as it isn't too rough).

UPDATE: just a quick note in case anyone finds it helpful in future. I did find what I consider a more elegant approach than the conjugate gradient method for the case where ${\bf b}$ is bounded above, ${\bf 0} \leq {\bf b} \leq {\bf u}$.
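For reference, the conjugate-gradient route mentioned above can be written out directly in a few lines of numpy, using the one-off factorised $({\bf A} + {\rm diag}({\bf u}))^{-1}$ as a preconditioner. This is only a minimal sketch; the function and variable names are mine, not from any particular library:

```python
import numpy as np

def pcg(matvec, c, precond, tol=1e-10, max_iter=500):
    """Preconditioned conjugate gradients for an SPD system,
    written out so each iteration is a couple of O(n^2) matvecs."""
    x = np.zeros_like(c)
    r = c.copy()                 # residual c - A x (x starts at 0)
    z = precond(r)
    p = z.copy()
    rz = r @ z
    for _ in range(max_iter):
        Ap = matvec(p)
        alpha = rz / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        if np.linalg.norm(r) <= tol * np.linalg.norm(c):
            break
        z = precond(r)
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x

# usage: solve (A + diag(b)) y = c, preconditioned by the one-off
# pre-computed B^{-1} = (A + diag(u))^{-1}
rng = np.random.default_rng(0)
n = 100
M = rng.standard_normal((n, n))
A = M @ M.T + np.eye(n)                # positive definite
u = np.ones(n)
b = rng.uniform(0.0, 1.0, size=n)      # 0 <= b <= u
c = rng.standard_normal(n)
B_inv = np.linalg.inv(A + np.diag(u))  # pre-computed once, O(n^3)
y = pcg(lambda v: A @ v + b * v, c, precond=lambda r: B_inv @ r)
```

Each CG iteration costs a couple of $\mathcal{O}(n^2)$ matrix-vector products, so it only wins if it converges in far fewer than $n$ iterations; with ${\bf B}^{-1}$ as preconditioner it typically does.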

Define ${\bf s} = {\bf u} - {\bf b}$. Then ${\bf A} + {\rm diag} ({\bf b}) = {\bf B} - {\rm diag} ({\bf s})$, where ${\bf B} = {\bf A} + {\rm diag} ({\bf u})$ is positive definite. Also: $${\bf y} = ( {\bf A} + {\rm diag} ({\bf b}))^{-1} {\bf c} = ( {\bf B} - {\rm diag} ({\bf s}))^{-1} {\bf c} = ( {\bf I} - {\bf B}^{-1} {\rm diag} ({\bf s}))^{-1} ({\bf B}^{-1} {\bf c})$$

This is just the Neumann series: $$({\bf I} - {\bf P})^{-1} = {\bf I} + {\bf P} + {\bf P}^2 + {\bf P}^3 + \ldots$$ which holds whenever the series converges, e.g. when the spectral radius of ${\bf P}$ is less than $1$ (see note below).

Combining these results (taking ${\bf P} = {\bf B}^{-1} {\rm diag}({\bf s})$), we get: $${\bf y} = {\bf x}_0 + {\bf x}_1 + {\bf x}_2 + \ldots$$ where: $${\bf x}_0 = {\bf B}^{-1} {\bf c} \\ {\bf x}_1 = {\bf B}^{-1} ({\rm diag}({\bf s}){\bf x}_0) \\ {\bf x}_2 = {\bf B}^{-1} ({\rm diag}({\bf s}){\bf x}_1) \\ {\bf x}_3 = {\bf B}^{-1} ({\rm diag}({\bf s}){\bf x}_2) \\ \ldots $$ (again, assume convergence for now...)

This motivates a very simple approach that seems to work surprisingly well:

  1. Pre-compute the inverse ${\bf B}^{-1} = ({\bf A} + {\rm diag} ({\bf u}))^{-1}$.
  2. Later, when you need to calculate ${\bf y} = ({\bf A}+{\rm diag}({\bf b}))^{-1} {\bf c}$ for some ${\bf b}$, calculate the sequence ${\bf x}_0$, ${\bf x}_1$, $\ldots$, ${\bf x}_k$ using the pre-computed inverse, terminating when $\| {\bf x}_k \|$ is "small enough" (either in absolute terms or relative to $\| {\bf x}_0 \|$).
  3. Compute ${\bf y} = {\bf x}_0 + {\bf x}_1 + \ldots + {\bf x}_k$.

That's it. The computational cost is $\mathcal{O}(k n^2)$ for $k$ iterations (each iteration is one matrix-vector product with the pre-computed inverse), which is much better than the $\mathcal{O}(n^3)$ of a full inversion whenever $k \ll n$.
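A minimal numpy sketch of the three steps above (function and variable names are mine; the divergence fallback from note 1 below is included):

```python
import numpy as np

def make_series_solver(A, u, tol=1e-10, max_iter=50):
    """Step 1: pre-compute B^{-1} = (A + diag(u))^{-1} once (O(n^3)),
    then return a solver doing steps 2-3 at O(n^2) per series term."""
    B_inv = np.linalg.inv(A + np.diag(u))      # one-off cost

    def solve(b, c):
        s = u - b                              # A + diag(b) = B - diag(s)
        x = B_inv @ c                          # x_0 = B^{-1} c
        y = x.copy()
        for _ in range(max_iter):              # step 2
            x_new = B_inv @ (s * x)            # x_{k+1} = B^{-1} diag(s) x_k
            if np.linalg.norm(x_new) >= np.linalg.norm(x):
                # divergence test failed: fall back to a direct solve
                return np.linalg.solve(A + np.diag(b), c)
            y += x_new                         # step 3, accumulated as we go
            if np.linalg.norm(x_new) <= tol * np.linalg.norm(y):
                break
            x = x_new
        return y

    return solve

# usage
rng = np.random.default_rng(0)
n = 200
M = rng.standard_normal((n, n))
A = M @ M.T + n * np.eye(n)        # positive definite
u = np.ones(n)
b = rng.uniform(0.0, 1.0, size=n)  # 0 <= b <= u
c = rng.standard_normal(n)
solve = make_series_solver(A, u)
y = solve(b, c)
```

In practice you keep `solve` around and call it once per new ${\bf b}$; only matrix-vector products with the stored inverse recur, so each call is $\mathcal{O}(k n^2)$.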

Notes:

  1. In my application "good enough" convergence usually happens in a few iterations ($k < 5$ is typical, at least in my experience), and if it runs too long or $\| {\bf x}_i \| > \| {\bf x}_{i-1} \|$ (i.e. the divergence test fails) then you can just take the efficiency hit and fall back to a full matrix inversion.
  2. If ${\bf A}$ is diagonal then it's easy to prove convergence, as the method reduces, coordinate-wise, to the usual geometric series expansion of $\frac{1}{a-s}$ with $0 \leq s < a$.
  3. I haven't proven convergence in the general case (the exact condition is that the spectral radius of ${\bf B}^{-1} {\rm diag}({\bf s})$ is less than $1$). I suspect it should be easy enough to show using norms and properties of positive definite matrices (maybe SVD might help?), but right now it's not pressing enough to work through.
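To spell out note 2: write $a$ for a diagonal entry of ${\bf B} = {\bf A} + {\rm diag}({\bf u})$ and $s$ for the matching entry of ${\bf s}$. Coordinate-wise, the iteration computes exactly

$$\frac{c}{a - s} = \frac{c}{a} \left( 1 + \frac{s}{a} + \left( \frac{s}{a} \right)^2 + \ldots \right)$$

which is a geometric series with ratio $s/a < 1$, since $s = u - b \leq u < a$ whenever the corresponding diagonal entry of ${\bf A}$ is strictly positive (as it must be for positive definite ${\bf A}$).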

Anyhow, caveat emptor, ymmv, probably-already-a-known-result, but hopefully that helps the next person.