Find the inverse of a submatrix of a given matrix


I have a $4 \times 4$ matrix $A$. By deleting the first row and first column of $A$, we obtain a $3 \times 3$ matrix $B$. Suppose I have already computed the inverse of $A$, denoted $A^{-1}$. Can I use this result to calculate the inverse of $B$? Is it possible, and is there a method for doing it? Thanks

Let us consider the simple case in which $A$ is invertible and $B$ is also invertible.


There are 5 answers below.

Accepted answer (by hardmath):

Note that $A$ differs from $\begin{bmatrix} 1 & 0 \\ 0 & B \end{bmatrix}$ by at most a rank two "update". Therefore formulas which "update" the matrix inverse $A^{-1}$ can be used to give an inverse of $B$:

$$ \begin{bmatrix} 1 & 0 \\ 0 & B \end{bmatrix}^{-1} = \begin{bmatrix} 1 & 0 \\ 0 & B^{-1} \end{bmatrix} $$

A rank one update to $A$ corresponds to a rank one update to $A^{-1}$ by the Sherman-Morrison formula:

$$ (A + uv^T)^{-1} = A^{-1} - \frac{A^{-1}uv^T A^{-1} }{1 + v^T A^{-1} u} $$
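As a quick numerical sanity check of the Sherman-Morrison formula, here is a short numpy sketch (the matrix sizes and random seed are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
A = rng.standard_normal((n, n))
u = rng.standard_normal(n)
v = rng.standard_normal(n)

A_inv = np.linalg.inv(A)

# Sherman-Morrison: (A + u v^T)^{-1} = A^{-1} - (A^{-1} u v^T A^{-1}) / (1 + v^T A^{-1} u)
denom = 1.0 + v @ A_inv @ u
sm = A_inv - np.outer(A_inv @ u, v @ A_inv) / denom

direct = np.linalg.inv(A + np.outer(u, v))
print(np.allclose(sm, direct))
```

For generic random data the denominator $1 + v^T A^{-1} u$ is nonzero, so the two computations agree to floating-point precision.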

The more general Woodbury formula could be used here to give the inverse of the rank two update to $A$ directly as a rank two update to $A^{-1}$.

$$ \left(A+UCV \right)^{-1} = A^{-1} - A^{-1}U \left(C^{-1}+VA^{-1}U \right)^{-1} VA^{-1} $$

The parameter $C$ above is redundant in the sense that we can take it to be the identity (and this is often done) choosing $U,V$ to "absorb" any factor $C$. In the case at hand:

$$ A = \begin{bmatrix} a & v_1 & v_2 & v_3 \\ u_1 & & & \\ u_2 & & B & \\ u_3 & & & \end{bmatrix} $$

we can take $U = \begin{bmatrix} 1 & 0 \\ 0 & u_1 \\ 0 & u_2 \\ 0 & u_3 \end{bmatrix}$ and $V = \begin{bmatrix} 0 & v_1 & v_2 & v_3 \\ 1&0&0&0 \end{bmatrix}$, so that:

$$ UV = \begin{bmatrix} 0 & v_1 & v_2 & v_3 \\ u_1 & 0 & 0 & 0 \\ u_2 & 0 & 0 & 0 \\ u_3 & 0 & 0 & 0 \end{bmatrix} $$

Then $A - UV = \begin{bmatrix} a & 0 \\ 0 & B \end{bmatrix}$ and provided $a \neq 0$ the update can be done by inverting $(I - VA^{-1}U)$, a $2\times 2$ matrix.

For matrices as small as $A$ and $B$ (respectively $4\times 4$ and $3\times 3$), such computations promise little improvement over a direct computation of $B^{-1}$. As the comments indicated interest in removing $k$ rows and columns from larger $n\times n$ matrices, the Woodbury formula can be used for this as well. However, one needs to keep in mind the cost of inverting the $2k\times 2k$ matrix $I - VA^{-1}U$ when $2k$ is a significant fraction of the original size $n$.
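A short numpy sketch of this construction (the random matrix and seed are arbitrary; only $A^{-1}$ is used when recovering $B^{-1}$):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
A_inv = np.linalg.inv(A)      # assumed to be known in advance

a = A[0, 0]
u = A[1:, 0]                  # first column of A below the corner entry
v = A[0, 1:]                  # first row of A to the right of the corner

# U and V as in the answer, so that A - U V = diag(a, B)
U = np.zeros((4, 2)); U[0, 0] = 1.0; U[1:, 1] = u
V = np.zeros((2, 4)); V[0, 1:] = v; V[1, 0] = 1.0

# Woodbury: (A - U V)^{-1} = A^{-1} + A^{-1} U (I - V A^{-1} U)^{-1} V A^{-1}
Z = A_inv @ U
W = A_inv + Z @ np.linalg.solve(np.eye(2) - V @ Z, V @ A_inv)

# W = diag(1/a, B^{-1}), so the lower-right 3x3 block is B^{-1}
B_inv = W[1:, 1:]
print(np.allclose(B_inv, np.linalg.inv(A[1:, 1:])))  # True for generic A
```

This works whenever $a \neq 0$, $B$ is invertible, and the $2 \times 2$ matrix $I - VA^{-1}U$ is invertible, which holds for a generic random matrix.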

Answer (by 5xum):

You can find an example in which $B$ is not invertible:

$$A=\begin{bmatrix}0 & 1 & 0 & 0\\ 1 & 0 & 0 & 0\\ 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 1 \end{bmatrix}$$

meaning that what you want to do is impossible in general: this $A$ is invertible, but deleting its first row and column leaves $B$ with a zero first row, so $B$ is singular.
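A quick numeric check of this counterexample:

```python
import numpy as np

A = np.array([[0., 1., 0., 0.],
              [1., 0., 0., 0.],
              [0., 0., 1., 0.],
              [0., 0., 0., 1.]])
B = A[1:, 1:]                      # delete the first row and column

print(np.linalg.matrix_rank(A))    # 4: A (a permutation matrix) is invertible
print(np.linalg.matrix_rank(B))    # 2: B has a zero first row, hence singular
```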

Answer:

Maybe you can use one of these Wikipedia pages:

http://en.wikipedia.org/wiki/Invertible_matrix#Blockwise_inversion

or

http://en.wikipedia.org/wiki/Block_matrix_pseudoinverse

The pseudoinverse will have to be used in case $B$ is not invertible, for instance in the case given in the answer by 5xum.

Answer:

Block inversion and the Schur complement were both mentioned but not developed into a closed-form expression, so perhaps the following is still useful.

Defining a unit vector $r$ that selects the single row and column to be deleted from the matrix, and the $n \times (n-1)$ complement matrix $R$ such that $R R^T + r r^T = I$, we can block-decompose a matrix as $$ M = R A R^T + R B r^T + r C R^T + r D r^T, $$ where $A = R^T M R$, $B = R^T M r$, $C = r^T M R$ and $D = r^T M r$. If $D$ is nonzero and the Schur complement $M/D$ is invertible, we can similarly decompose the inverse as $$ M^{-1} = R (M/D)^{-1} R^T + R \dots r^T + r \dots R^T + r \dots r^T. $$

Isolating the first block (using $r^T R = 0$ and $R^T R = I$) we obtain $$ (R^T M^{-1} R)^{-1} = M / D = A - B D^{-1} C = R^T \left( M - \frac {M r r^T M} {r^T M r} \right) R. $$

Substituting $M = P^{-1}$ we obtain an expression for the inverse of a submatrix of $P$ in terms of $P^{-1}$: $$ (R^T P R)^{-1} = R^T \left( P^{-1} - \frac {P^{-1} r r^T P^{-1}} {r^T P^{-1} r} \right) R. $$
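A numpy sketch verifying this closed-form expression (the size $n$, the deleted index $i$, and the random seed are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(1)
n, i = 5, 2                       # delete row and column i
P = rng.standard_normal((n, n))
P_inv = np.linalg.inv(P)

r = np.zeros(n); r[i] = 1.0       # unit vector selecting the deleted index
R = np.eye(n)[:, [j for j in range(n) if j != i]]  # n x (n-1) complement

# Left side: invert the submatrix directly
lhs = np.linalg.inv(R.T @ P @ R)

# Right side: use only P^{-1}, per the formula above
d = r @ P_inv @ r                 # scalar r^T P^{-1} r (must be nonzero)
rhs = R.T @ (P_inv - np.outer(P_inv @ r, r @ P_inv) / d) @ R

print(np.allclose(lhs, rhs))
```

The formula holds whenever the submatrix is invertible and $r^T P^{-1} r \neq 0$, which is the generic case for random data.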

Answer:

Here I discuss how to generalize hardmath's (excellent) answer as follows:

  • $k$ rows and $k$ columns are removed instead of just one.
  • The rows and columns may be at any location in the matrix (not necessarily at the front, and not necessarily in a contiguous block).

I came across this question because I needed to solve this more general problem for a numerical method I am working on. Conceptually, all the key ideas are in hardmath's answer, but I found it still took a bit of work to do the generalization to a level of detail where I could implement it on a computer efficiently. I am posting my findings here to save future people the trouble. I also include python/numpy code which anyone may use as they see fit.


Let $\mathcal{R}^g$ be an index set of "good" rows of $A$, and let $\mathcal{R}^b$ be the complementary index set of "bad" rows of $A$, so that $\mathcal{R}^g \cup \mathcal{R}^b = \{1,2,\dots,N\}$, where $A$ is an $N \times N$ matrix.

Likewise, let $\mathcal{C}^g$ be an index set of "good" columns of $A$, and $\mathcal{C}^b$ be the complementary index set of "bad" columns of $A$.

We seek to solve the linear system $$A\left(\mathcal{R}^g,\mathcal{C}^g\right) ~x = b, \qquad (1)$$ where $A\left(\mathcal{R}^g,\mathcal{C}^g\right)$ denotes the submatrix of $A$ consisting of the "good" rows and columns (the same notation used for selecting a submatrix in Matlab).

For example, if $\mathcal{R}^g=(1,3)$, $\mathcal{C}^g=(5,4)$, and $N=5$, then the matrix in equation (1) becomes $$A\left(\mathcal{R}^g,\mathcal{C}^g\right) = A\left((1,3),(5,4)\right) = \begin{bmatrix} A_{15} & A_{14} \\ A_{35} & A_{34} \end{bmatrix}, $$ and the complementary "bad" index sets are $\mathcal{R}^b=(2,4,5)$ and $\mathcal{C}^b=(1,2,3)$. (the ordering of the bad index sets is not important so long as one keeps it consistent)

Now consider the matrix $\widetilde{A}$ which is the same as $A$, except matrix entries corresponding to interactions between "good" and "bad" index sets are set to zero. I.e., \begin{align} \widetilde{A}\left(\mathcal{R}^g,\mathcal{C}^g\right) &= A\left(\mathcal{R}^g,\mathcal{C}^g\right) \\ \widetilde{A}\left(\mathcal{R}^b,\mathcal{C}^b\right) &= A\left(\mathcal{R}^b,\mathcal{C}^b\right) \\ \widetilde{A}\left(\mathcal{R}^g,\mathcal{C}^b\right) &= 0 \\ \widetilde{A}\left(\mathcal{R}^b,\mathcal{C}^g\right) &= 0. \end{align} Also, let $\widetilde{b}$ be the vector with $\widetilde{b}(\mathcal{R}^g)=b$, and $\widetilde{b}(\mathcal{R}^b)=0$, where $\widetilde{b}(\mathcal{R}^g)$ denotes the entries of $\widetilde{b}$ corresponding to indices in $\mathcal{R}^g$, and likewise for $\widetilde{b}(\mathcal{R}^b)$.

Since these cross interactions in $\widetilde{A}$ are zero, if we solve $$\widetilde{A} \widetilde{x} = \widetilde{b}, \qquad (2)$$ then $$\widetilde{x}\left(\mathcal{C}^g\right) = x.$$ In other words, the solution of (1) is found by extracting the $\mathcal{C}^g$ entries from the solution of (2).

Now, like in hardmath's answer, $\widetilde{A}$ can be written as a low rank update of $A$ in the form $$\widetilde{A} = A - UV,$$ and system (2) can be solved via the Woodbury formula: \begin{align} \widetilde{x} = \widetilde{A}^{-1}\widetilde{b} &= \left(A - UV\right)^{-1}\widetilde{b} \\ &= A^{-1}\widetilde{b} + A^{-1}U\left(I_{2k \times 2k} - VA^{-1}U\right)^{-1}VA^{-1}\widetilde{b}. \end{align} Here $U$ is an $N \times 2k$ matrix, $V$ is a $2k \times N$ matrix, and $I_{2k \times 2k}$ is the $2k \times 2k$ identity matrix, where $k$ is the number of "bad" rows/columns (the size of $\mathcal{R}^b$ and $\mathcal{C}^b$).

The matrices $U$ and $V$ are given as follows: \begin{align} U\left(\mathcal{R}^g, (1,\dots,k)\right) &= 0 \\ U\left(\mathcal{R}^b, (1,\dots,k)\right) &= I_{k\times k} \\ U\left(\mathcal{R}^g, (k+1,\dots,2k)\right) &= A\left(\mathcal{R}^g, \mathcal{C}^b\right) \\ U\left(\mathcal{R}^b, (k+1,\dots,2k)\right) &= 0 \end{align} \begin{align} V\left((1,\dots,k), \mathcal{C}^g\right) &= A\left(\mathcal{R}^b, \mathcal{C}^g\right) \\ V\left((1,\dots,k), \mathcal{C}^b\right) &= 0 \\ V\left((k+1,\dots,2k), \mathcal{C}^g\right) &= 0 \\ V\left((k+1,\dots,2k), \mathcal{C}^b\right) &= I_{k\times k}. \end{align}


Here is some python/numpy code demonstrating how to do this.

First, building the matrices $U$ and $V$ and verifying that they are correct:

import numpy as np

N = 102
A = np.random.randn(N, N)

bad_rows = [2,   1, 37, 15]
bad_cols = [80, 60, 40, 55]
k = len(bad_rows)

good_rows = np.setdiff1d(np.arange(A.shape[0]), bad_rows)
good_cols = np.setdiff1d(np.arange(A.shape[1]), bad_cols)

A_tilde_true = A.copy()
A_tilde_true[:, bad_cols] = 0.0
A_tilde_true[bad_rows, :] = 0.0
A_tilde_true[np.ix_(bad_rows, bad_cols)] = A[np.ix_(bad_rows, bad_cols)]

U = np.zeros((A.shape[0], 2*k))
U[bad_rows, :k] = np.eye(k)
U[good_rows, k:] = A[np.ix_(good_rows, bad_cols)]

V = np.zeros((2*k, A.shape[1]))
V[:k, good_cols] = A[np.ix_(bad_rows, good_cols)]
V[k:, bad_cols] = np.eye(k)

A_tilde = A - np.dot(U, V)
err_UV = np.linalg.norm(A_tilde_true - A_tilde) / np.linalg.norm(A_tilde_true)
print('err_UV=', err_UV)

which outputs: "err_UV= 0.0"

And now using $U$ and $V$ with the Woodbury formula to solve a linear system with the submatrix:

b = np.random.randn(N-k)
x_true = np.linalg.solve(A[np.ix_(good_rows, good_cols)], b)

Z = np.linalg.solve(A, U) # Z = inv(A)*U
C = np.eye(2*k) - np.dot(V, Z) # C = I - V*inv(A)*U

b_tilde = np.zeros(N)
b_tilde[good_rows] = b

x_tilde = np.linalg.solve(A, b_tilde) # x0 = inv(A)*b_tilde
x_tilde += np.dot(Z, np.linalg.solve(C, np.dot(V, x_tilde))) # dx = inv(A)*U*inv(C)*V*x0
x = x_tilde[good_cols]

err_woodbury = np.linalg.norm(x - x_true) / np.linalg.norm(x_true)
print('err_woodbury=', err_woodbury)

which outputs "err_woodbury= 9.049341577063206e-14" (i.e., zero, up to floating point rounding errors).