Insightful proofs for Sherman-Morrison Formula and Matrix Determinant Lemma

20.5k Views Asked by At

There are two statement about a matrix under rank-one updates that I would be grateful if you give me some insightful proofs.

Suppose $A$ be a nonsingular $n \times n$ matrix and $\mathbf{u},\mathbf{v}$ be vectors.

First, the Sherman-Morrison formula states that:

$$ \left( A + \mathbf{u} \mathbf{v}^T \right)^{-1} = A^{-1} - \frac{A^{-1}\mathbf{u} \mathbf{v}^TA^{-1}}{1 + \mathbf{v}^TA^{-1}\mathbf{u}}.$$

(We can prove this by verifying that the RHS multiplied by $A + \mathbf{u} \mathbf{v}^T$ is $I$.)

Second, Matrix Determinant Lemma states that:

$$\det\left(A+\mathbf{u}\mathbf{v}^T\right) = \det(A) \left( 1 + \mathbf{v}^T A^{-1}\mathbf{u} \right).$$

(In the proof from wikipedia, we just have to verify some identity again.)

It's easy to verify these proofs but it's not clear to me how to come up with the identity. Are there any other proofs which are not just by multiplication of matrices, and give us some insight ? Or even some informal explanation ?

3

There are 3 best solutions below

5
On BEST ANSWER

Look first at $I + \mathbf{u}\mathbf{v}^\top$. This may require factoring $A + \mathbf{u}\mathbf{v}^\top = A\left(I + A^{-1}\mathbf{u}\mathbf{v}^\top\right)$ which may help the intuition for the term $A^{-1}\mathbf{u}$ in the formulas.

Consider how $I + \mathbf{u}\mathbf{v}^\top$ acts on the vector $\mathbf{u}$: $$\left(I + \mathbf{u}\mathbf{v}^\top\right)\mathbf{u} = \mathbf{u} + \mathbf{u}\mathbf{v}^\top\mathbf{u} = \left(1+\mathbf{v}^\top\mathbf{u}\right)\mathbf{u}$$

This shows that $\mathbf{u}$ is a right eigenvector with eigenvalue of $1+\mathbf{v}^\top\mathbf{u}$. The inverse must have the same eigenvector but with eigenvalue $(1+\mathbf{v}^\top\mathbf{u})^{-1}$. (If $\mathbf{v}^\top\mathbf{u}=-1$ then the matrix is singular.) The rest of the eigenvalues are ones, since any $\mathbf{b}$ such that $\mathbf{v}^\top\mathbf{b} = 0$ gives $\left(I + \mathbf{u}\mathbf{v}^\top\right)\mathbf{b}=\mathbf{b}$. This completes the entire spectrum ( so long as $\mathbf{v}^\top\mathbf{u} \ne -1$), showing eignevalues of ones and the value of $1 + \mathbf{v}^\top\mathbf{u}$.

From here notice that any matrix with such a spectrum must be of the form $I+\mathbf{u}g\mathbf{v}^\top$ (after the factorization of $A$ mentioned earlier) where $g$ is any scalar. This is the general form of matrix that has such a spectrum, with all except one of the eigenvalues as ones, the other eignevalue with the right and left eigenvectors of $\mathbf{u}$ and $\mathbf{v}^\top$ (having eigenvalue parametric in the variable $g$).

Once the necessity of that form is realized, the rest is algebra, finding the value for $g$ that solves the equations. For example, the inverse:

\begin{align} \left(I+ \mathbf{u}\mathbf{v}^\top\right) \left(I+ \mathbf{u}\mathbf{v}^\top\right)^{-1} &= I \\ \left(I+ \mathbf{u}\mathbf{v}^\top\right) \left(I+ \mathbf{u}g\mathbf{v}^\top\right) &=I \\ I+ \mathbf{u}g\mathbf{v}^\top+ \mathbf{u}\mathbf{v}^\top + \mathbf{u}\mathbf{v}^\top\mathbf{u}g\mathbf{v}^\top&=I \\ I+ \mathbf{u}\left(g+ 1 + \mathbf{v}^\top\mathbf{u}g \right)\mathbf{v}^\top&=I \\ \Rightarrow g+ 1 + \mathbf{v}^\top\mathbf{u}g &= 0 \\ g(1+\mathbf{v}^\top\mathbf{u}) &= -1 \\ g = \frac{-1}{1+\mathbf{v}^\top\mathbf{u}} \\ \end{align}

So that we have $$\left(I+ \mathbf{u}\mathbf{v}^\top\right)^{-1} = \left(I+ \mathbf{u}\frac{-1}{1+\mathbf{v}^\top\mathbf{u}}\mathbf{v}^\top\right)$$

2
On

Note that it is sufficient to prove the Sherman-Morrison Formula for $A = I$. Indeed suppose that we proved it for $A=I$ then \begin{align*} (A+\mathbf{u}\mathbf{v}^T)^{-1} &= (A(I+(A^{-1}\mathbf{u})\mathbf{v}^T))^{-1} = \left(I - \frac{(A^{-1}\mathbf{u})\mathbf{v}^T}{1+\mathbf{v}^T(A^{-1}\mathbf{u})}\right)A^{-1} \\ &= A^{-1} - \frac{A^{-1}\mathbf{u}\mathbf{v}^TA^{-1}}{1+\mathbf{v}^TA^{-1}\mathbf{u}}. \end{align*}

Here is a derivation of the Sherman-Morrison Formula for the case $A=I$ when $\|u\| < 1$ and $\|v\| < 1$. \begin{align*} (I+\mathbf{u}\mathbf{v}^T)^{-1} &= \sum_{k=0}^{\infty} (-1)^k (\mathbf{u}\mathbf{v}^T)^k = I + \sum_{k=1}^{\infty} (-1)^k (\mathbf{u}\mathbf{v}^T)^k\\ &= I + \mathbf{u}\left(\sum_{k=1}^{\infty} (-1)^{k} (\mathbf{v}^T\mathbf{u})^{k-1} \right)\mathbf{v}^T = I - \mathbf{u} \times \frac{1}{1 + \mathbf{v}^T\mathbf{u}} \times \mathbf{v}^T\\ &=I - \frac{\mathbf{u} \mathbf{v}^T}{1 + \mathbf{v}^T\mathbf{u}} \end{align*}

1
On

The idea behind Sherman Morrison is to see how a low-rank perturbation affects the inverse. i.e. if we write $$\left(A + uv^T \right)^{-1} - A^{-1} = B$$ we want to know if something can be said about $B$. $$\left(A + uv^T \right) \left( A^{-1} + B\right) = I$$ Hence, $$I + uv^T A^{-1} + AB + uv^T B = I$$ Hence, this shows that $$AB = -uv^T\left(A^{-1} + B \right)$$ is a rank $1$ matrix. Since $A$ is a full-rank matrix, we see that $B$ should be rank $1$. This is the main take-home message of Sherman-Morrison i.e. a rank $r$ update is preserved under inversion. Once this is realized, the exact formula for $B$ can then be obtained from algebraic manipulations.