Rewriting product of special rank one updates as a low rank update

52 Views Asked by At

I'm trying to improve the speed of the following iteration to calculate $s_k$:

$$B_k^{-1} = \Bigg( I + \frac{s_{k}s_{k-1}^T}{||s_{k-1}||^2}\Bigg)...\Bigg(I+ \frac{s_1s_0^T}{||s_0||^2}\Bigg) B_0^{-1}\\s_{k+1} = -\frac{B_k^{-1}r_{k+1}}{1+\frac{s_k^TB_k^{-1}r_{k+1}}{||s_k||^2}}$$

If we calculate $B_k^{-1}$ explicitly then each step is roughly $O(n^4)$ to calculate $B_k^{-1}$. If instead we compute $B_k^{-1}r_{k+1}$ by expanding the product from the right, we do $k$ matrix-vector multiplications which is $O(n^3)$, a lot better.

Is there a way to find $s_k$ in the form of a low rank update using the value of previous $s_i$?

1

There are 1 best solutions below

0
On BEST ANSWER

First note that rank-1 updates can be done much more efficiently without explicitly forming the rank-1 matrices. You can rewrite $(I+uv^T)w=w+(v^Tw)u$ so the cost is $O(n)$ if $u$, $v$, and $w$ are $n$-vectors. If $W$ is an $n\times n$ matrix, computing $(I+uv^T)W$ costs $n$-times as for the vector so $O(n^2)$.

There are at least three possibilities:

  1. Compute $B_k^{-1}$ explicitly using $k$ rank-1 updates. The cost of computing $B_k^{-1}$ is $O(kn^2)$ and then $O(n^2)$ to compute $s_{k+1}$ so in total $O(kn^2)$ at iteration $k$.

  2. Compute $s_{k+1}$ without forming $B_k^{-1}$. Computing $B_0^{-1}r_{k+1}$ is $O(n^2)$ unless $B_0$ is very special (diagonal, identity,...). Then applying the $k$ rank-1 updates on the result requires $O(kn)$. Then the total cost depends on the iteration number. Usually $k\ll n$ so the total cost would be $O(n^2)$ per iteration.

  3. Compute $B_k^{-1}$ recursively. We can update $B_k^{-1}$ from $B_{k-1}^{-1}$ as $$ B_k^{-1}=\left(I-\frac{s_ks_{k-1}^T}{\|s_{k-1}\|^2}\right)B_{k-1}^{-1} $$ with the cost $O(n^2)$. The computing $B_k^{-1}r_k$ is again just a mat-vec so the total gets to $O(n^2)$ at iteration $k$.

Out of these three approaches the last one seems to be the most appealing as it results always in $O(n^2)$ operations per iteration.

I assumed here that the $B_0^{-1}$ is dense. If it is sparse, then the second approach would be better as forming $B_0^{-1}r_0$ would no longer be $O(n^2)$ and the total cost would be dominated by the $k$ rank-1 updates resulting in $O(kn)$ at iteration $k$.