Does a method exist to calculate the approximate inverse of a matrix whose elements are small? To be clearer, suppose the elements of the matrix $A$ are functions of a small parameter, i.e. $a_{ij} = a_{ij}(\epsilon)$. Is it possible to obtain the elements of the inverse $A^{-1}$ without first computing it exactly and then studying the limit $\epsilon \rightarrow 0$? For example, could we expand the inverse in a series in $\epsilon$ whose terms can be calculated from $A$?
Approximate inverse matrix
Asked by Bumbble Comm (https://math.techqa.club/user/bumbble-comm/detail). There are 2 best solutions below.
It is in general difficult to compute the inverse of a matrix $A$ when the entries $a_{ij}$ are all near $0$, since the matrix inverse function $A \to A^{-1}$ is so badly behaved as $A \to 0$. When $A$ is near $0$, $A^{-1}$ becomes very large, giving rise to computational instabilities. This can even be seen when $A$ is $1 \times 1$, by looking at the derivative of the map $x \to x^{-1}$: if $x$ is a function of some parameter $\epsilon$, $x = x(\epsilon)$, then we have
$\frac{dx^{-1}(\epsilon)}{d\epsilon} = -x^{-2}\frac{dx(\epsilon)}{d\epsilon}$,
showing that $\frac{dx^{-1}(\epsilon)}{d\epsilon}$ can become very large even when $\frac{dx(\epsilon)}{d\epsilon}$ is small. This situation gives rise to numerical errors which become amplified by a factor of $\frac{1}{x^2}$, often severely degrading the accuracy, hence the utility, of approximation methods for $x$ near $0$. And basically the same thing happens with matrices. If $A(\epsilon)$ is a parametrized family of matrices, differentiable in $\epsilon$, then we may write
$A^{-1}(\epsilon)A(\epsilon) = I$,
and taking $\epsilon$-derivatives yields
$\frac{dA^{-1}(\epsilon)}{d\epsilon}A(\epsilon) + A^{-1}(\epsilon)\frac{dA(\epsilon)}{d\epsilon} = 0$,
from which it follows that
$\frac{dA^{-1}(\epsilon)}{d\epsilon} = -A^{-1}(\epsilon)\frac{dA(\epsilon)}{d\epsilon}A^{-1}(\epsilon)$,
and this equation shows that errors in $A(\epsilon)$, as estimated by $\frac{dA(\epsilon)}{d\epsilon}$ via first-order expansion
$A(\epsilon + \Delta \epsilon) \approx A(\epsilon) + \frac{dA(\epsilon)}{d\epsilon}(\Delta \epsilon)$,
will also be subject to a potentially very large "magnification factor" of magnitude roughly $\vert A^{-1} \vert^2$, where $\vert A^{-1} \vert$ denotes your favorite matrix norm applied to $A^{-1}$:
$A^{-1}(\epsilon + \Delta \epsilon) \approx A^{-1}(\epsilon) + \frac{dA^{-1}(\epsilon)}{d\epsilon}\Delta \epsilon = A^{-1}(\epsilon) - A^{-1}(\epsilon)\frac{dA(\epsilon)}{d\epsilon}A^{-1}(\epsilon)\Delta \epsilon$,
whence
$\vert A^{-1}(\epsilon + \Delta \epsilon) - A^{-1}(\epsilon) \vert \approx \vert A^{-1}(\epsilon)\frac{dA(\epsilon)}{d\epsilon}A^{-1}(\epsilon)\Delta \epsilon \vert \approx \vert A^{-1} \vert^2 \vert \frac{dA(\epsilon)}{d\epsilon}\Delta \epsilon \vert$.
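As a concrete numerical sketch of this magnification (using a made-up one-parameter family $A(\epsilon) = \epsilon I$, not anything from the question), one can compare the actual change in the inverse against the $\vert A^{-1} \vert^2$-magnified estimate:

```python
import numpy as np

# Made-up family A(eps) = eps * I, so A^{-1}(eps) = (1/eps) * I and the
# magnification factor |A^{-1}|^2 equals 1/eps^2 in the spectral norm.
eps, d_eps = 1e-3, 1e-6

A = eps * np.eye(2)
A_pert = (eps + d_eps) * np.eye(2)  # A(eps + d_eps); here dA/deps = I

# Actual change in the inverse, versus the |A^{-1}|^2-magnified estimate.
diff = np.linalg.norm(np.linalg.inv(A_pert) - np.linalg.inv(A), 2)
predicted = np.linalg.norm(np.linalg.inv(A), 2) ** 2 * d_eps

print(diff, predicted)  # a 1e-6 perturbation moves the inverse by about 1.0
```

A perturbation of size $10^{-6}$ in $A$ produces an $O(1)$ change in $A^{-1}$, matching the estimate to within the higher-order terms dropped in the first-order expansion.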
For these and similar reasons, one tries to avoid computing $A^{-1}$ directly when $\vert A \vert$ is small, i.e., when $A$ is near $0$. What is done, however, is to compute approximations to $A^{-1}(\epsilon)$ when some of the entries of $A(\epsilon)$ are small. For example, if we know $A^{-1}(0)$ and we have
$A(\epsilon) = A(0) + (\Delta A)(\epsilon) = A(0)(I + A^{-1}(0)(\Delta A (\epsilon)))$
with
$\vert A^{-1}(0)(\Delta A (\epsilon)) \vert < 1$,
we can in fact exploit the Neumann series mentioned by Hagen von Eitzen in his comment to obtain, using the notation $D(\epsilon) = A^{-1}(0)(\Delta A (\epsilon))$,
$A^{-1}(\epsilon) = (I + D(\epsilon))^{-1}A^{-1}(0) = (I - D(\epsilon) + D^2(\epsilon) - D^3(\epsilon) + \cdots)A^{-1}(0)$,
which avoids direct computation of $A^{-1}(\epsilon)$ but replaces it with computing the powers $D^i(\epsilon)$ of $D(\epsilon)$. This expression will presumably boil down to a power series in $\epsilon$ if $D(\epsilon)$ is sufficiently nice as a function of $\epsilon$, e.g., if $D(\epsilon)$ is analytic in $\epsilon$.
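A minimal numerical sketch of this series (the matrices $A(0)$ and the perturbation $\Delta A(\epsilon) = \epsilon C$ below are invented purely for illustration):

```python
import numpy as np

def neumann_inverse(A0_inv, delta_A, terms=20):
    """Approximate (A(0) + delta_A)^{-1} via the Neumann series
    (I - D + D^2 - D^3 + ...) A(0)^{-1} with D = A(0)^{-1} delta_A,
    valid when ||D|| < 1."""
    n = A0_inv.shape[0]
    D = A0_inv @ delta_A
    term = np.eye(n)
    total = np.eye(n)
    for _ in range(1, terms):
        term = -term @ D          # accumulates (-1)^k D^k
        total += term
    return total @ A0_inv

# Hypothetical example: A(eps) = A(0) + eps * C with small eps.
A0 = np.array([[2.0, 1.0], [0.0, 3.0]])
C = np.array([[1.0, -1.0], [2.0, 0.5]])
eps = 0.1

approx = neumann_inverse(np.linalg.inv(A0), eps * C)
exact = np.linalg.inv(A0 + eps * C)
print(np.max(np.abs(approx - exact)))  # tiny residual: the series has converged
```

Since $\Vert D(\epsilon) \Vert$ is of order $\epsilon$ here, each extra term improves the approximation by roughly a factor of $\epsilon$, which is exactly the power-series behavior described above.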
Finally, in your case
$A(\epsilon) = \frac{1}{\epsilon^2}(I - B(\epsilon))$,
you are fortunate insofar as the factor $\frac{1}{\epsilon^2}$ is a scalar and can be "pulled out in front" of the matrices. Thus all you have to do is evaluate the series for
$(I - B(\epsilon))^{-1} = I + B(\epsilon) + B^2(\epsilon) + B^3(\epsilon) + \cdots$
and multiply the final result by $\epsilon^2$, since $A^{-1}(\epsilon) = \epsilon^2 (I - B(\epsilon))^{-1}$. Care must be taken, however, to compute a sufficient number of terms of the above Neumann series for $(I - B(\epsilon))^{-1}$ that a relatively high degree of convergence is obtained; the truncation error propagates directly into $A^{-1}(\epsilon)$, and when $\vert B(\epsilon) \vert$ is close to $1$ the series converges slowly, so many terms may be needed.
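This recipe can be sketched numerically as follows; the matrix $B$ below is a made-up example with $\Vert B \Vert < 1$, and the relation $A^{-1}(\epsilon) = \epsilon^2 (I - B(\epsilon))^{-1}$ follows from the form of $A(\epsilon)$ above:

```python
import numpy as np

# A(eps) = (1/eps^2) (I - B), so A^{-1}(eps) = eps^2 (I + B + B^2 + ...).
eps = 0.1
B = np.array([[0.3, 0.2], [0.1, 0.4]])   # made-up, with ||B|| < 1

n_terms = 50
partial = np.eye(2)                       # partial sum I + B + ... + B^{n-1}
power = np.eye(2)
for _ in range(n_terms - 1):
    power = power @ B
    partial += power

A = (np.eye(2) - B) / eps**2
approx_inv = eps**2 * partial
print(np.max(np.abs(approx_inv - np.linalg.inv(A))))  # truncation error is tiny
```

With the eigenvalues of $B$ well inside the unit disc, 50 terms are far more than enough; for $B$ with spectral radius close to $1$, the term count would have to grow accordingly.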
Hope this helps. Cheers.
The comment by Hagen von Eitzen refers to the Neumann series for the inverse. With $A=\epsilon^{-2}(I-B(\epsilon))$ it takes the form $$A^{-1}=\epsilon^2 (I+B(\epsilon)+B(\epsilon)^2+B(\epsilon)^3+\dots) \tag1$$ A sufficient condition for the convergence of (1) is $\|B(\epsilon)\|<1$.
The necessary and sufficient condition for the convergence of (1) is $\rho(B(\epsilon))<1$, where $\rho$ is the spectral radius.
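A quick numerical check of this distinction, using a made-up non-normal matrix whose spectral norm exceeds $1$ while its spectral radius is $0.5$, so the Neumann series converges even though the norm condition fails:

```python
import numpy as np

# ||B|| > 1 in the spectral norm, but rho(B) = 0.5 < 1.
B = np.array([[0.5, 10.0], [0.0, 0.5]])

spec_norm = np.linalg.norm(B, 2)
rho = float(np.max(np.abs(np.linalg.eigvals(B))))
print(spec_norm, rho)                     # norm well above 1, rho = 0.5

partial = np.eye(2)                       # partial sum of I + B + B^2 + ...
power = np.eye(2)
for _ in range(200):
    power = power @ B
    partial += power

print(np.max(np.abs(partial - np.linalg.inv(np.eye(2) - B))))  # converged
```

The early partial sums actually grow before they shrink (the terms $B^k$ contain factors like $k \cdot 0.5^{k-1}$), which is typical when the norm exceeds $1$ but the spectral radius does not.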