Suppose $A$ is an invertible square matrix and $u,v$ are column vectors. Suppose furthermore that $1 + v^T A^{-1}u \neq 0$. Then the Sherman–Morrison formula states that:
$$ (A+uv^T)^{-1} = A^{-1} - {A^{-1}uv^T A^{-1} \over 1 + v^T A^{-1}u} $$
Here, $uv^T$ is the outer product of two vectors $u$ and $v$.
A proof I saw online was here: Direct Derivation of the Sherman-Morrison Formula.
To outline the proof:
They find the inverse by solving the linear problem:
$$ (A-uv^T)x=b $$
They pre-multiply the equation by $A^{-1}$ and let $z = A^{-1}$ and $y=A^{-1}b$. Then,
$$ x-zv^Tx = y $$
They then let $\alpha = v^Tx$, so that they get:
$$ \alpha - v^Tz\alpha = v^Ty. $$
They then solve for $\alpha$:
$\alpha = \dfrac{v^Ty}{1-v^Tz}$
They then claim the solution can be written as $x=y+\alpha z$.
$$ x=y+\alpha z = A^{-1}b+A^{-1}u(1-v^TA^{-1}u)^{-1}v^TA^{-1}b $$
Then, they say that
$$ x = [A^{-1}+A^{-1}u(1-v^TA^{-1}u)^{-1}v^TA^{-1}]b $$
and that the inverse is $A^{-1}+A^{-1}u(1-v^TA^{-1}u)^{-1}v^TA^{-1}$
What I don't understand is how this even works. I do get the gist, in that they claim that if $y=Ax$, then if you use least squares and solve for $x = A^{-1}y$, then whatever is in the A^{-1} should be the inverse.
However, what confuses me is why we solve for $\alpha$ (is this like finding the $\beta$ in a least squares format?) optimized? What is so special about it? Also, why is pre-multiplying the equation by $A^{-1}$ the same as taking derivatives, etc? Thanks!
It's a little easier to use an inner product, $\langle a,b\rangle=a^{T}b=b^{T}a$. This helps separate vectors and scalars. You want to solve for $x$ in this equation, given $b$: $$ (A+uv^{T})x=b \\ Ax+\langle v,x\rangle u = b \\ x +\langle v,x\rangle A^{-1}u=A^{-1}b $$ This gives you a scalar and a vector involving $x$. To turn this into a scalar equation, take the inner product on the left with $v$: $$ \langle v,x\rangle + \langle v,x\rangle\langle v,A^{-1}u\rangle=\langle v,A^{-1}b\rangle $$ Now you can solve for $\langle v,x\rangle$. $$ \langle v,x\rangle = \frac{\langle v,A^{-1}b\rangle}{1+\langle v,A^{-1}u\rangle}. $$ Finally, substitute back into the third equation of this post. \begin{align} (A+uv^T)^{-1}b= x & =A^{-1}b-\langle v,x\rangle A^{-1}u \\ & =A^{-1}b-\frac{\langle v,A^{-1}b\rangle}{1+\langle v,A^{-1}u\rangle}A^{-1}u \\ & = A^{-1}b-\frac{A^{-1}uv^TA^{-1}b}{1+v^TA^{-1}u} \end{align}