Why is $LU$ preferred over $A^{-1}$ to solve matrix equations?


I understand the whole $LU$-decomposition vs. Gaussian elimination argument. The fact that you can isolate the computationally expensive elimination step and re-use the $L$ and $U$ matrices to solve $Ax = b$ for different right-hand sides $b$ makes sense to me. But I can't seem to find a reason why the $L$ and $U$ matrices are preferred over an $A^{-1}$ matrix, which can also be re-used for multiple right-hand sides. So that's my question: why is $LU$ preferred?


There are 3 answers below.

BEST ANSWER

As other commenters have noted, it's $\mathcal{O}(n^3)$ operations to compute either $A^{-1}$ or an $LU$ decomposition, and it's $\mathcal{O}(n^2)$ operations to solve $Ax = b$ once either $A^{-1}$ or an $LU$ decomposition has been computed. So from this point of view, both approaches are equally difficult.

The next level of granularity for analysing matrix computations is flop counts, where we count the total number of floating point operations (additions, subtractions, multiplications, and divisions) as a function of $n$. Usually, we truncate this expression to its leading-order term.

Going through the analysis, it takes $2n^3/3$ operations to compute an $LU$ factorization of $A$ and $2n^3$ operations to compute $A^{-1}$. Moreover, it costs $2n^2$ operations to solve $Ax = b$ either by triangular substitution from an $LU$ factorization or by multiplying by $A^{-1}$. So even for multiple right hand side problems where we want to solve $Ax = b$ for $m \gg n$ different values of $b$, computing $A^{-1}$ gives you no benefit over an $LU$ factorization. And for single right hand side problems, you've tripled the cost ($2n^3$ vs $2n^3/3$). Tripling the cost of an operation isn't a huge deal, but if you're going to make my code run at one third the speed, you should have a good reason. (If you're willing to accept a higher cost, you might as well solve $Ax = b$ by $QR$ factorization, which has superior stability properties due to the orthogonal and thus perfectly well-conditioned $Q$ matrix.)
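To make the factor-once, solve-many pattern concrete, here is a small sketch (assuming NumPy and SciPy are available; `lu_factor`/`lu_solve` wrap LAPACK's LU with partial pivoting):

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

rng = np.random.default_rng(0)
n = 200
A = rng.standard_normal((n, n))

# Pay the O(n^3) cost (about 2n^3/3 flops) exactly once.
lu, piv = lu_factor(A)

# Each additional right-hand side costs only two O(n^2) triangular solves.
for _ in range(5):
    b = rng.standard_normal(n)
    x = lu_solve((lu, piv), b)
    assert np.allclose(A @ x, b)
```

Note that explicitly forming $A^{-1}$ would buy nothing here: multiplying by $A^{-1}$ is also $2n^2$ flops per right-hand side.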

A possible response might be: computing $A^{-1}$ is more accurate. But exactly the opposite is true: solving $Ax = b$ by computing $A^{-1}b$ is often much less accurate than computing $U^{-1}L^{-1}b$. The analysis is done in Higham's excellent monograph Accuracy and Stability of Numerical Algorithms, Section 14.1, where he also provides an example where solving $Ax = b$ via $A^{-1}$ in double precision produces a $\sim 10^6$ increase in the backwards error versus an $LU$ factorization (with partial pivoting).
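A small experiment illustrates the backward-error gap Higham describes. This is only a sketch using NumPy; the $12 \times 12$ Hilbert matrix is one convenient ill-conditioned example, not anything canonical:

```python
import numpy as np

n = 12
# Hilbert matrix: H[i, j] = 1 / (i + j + 1), notoriously ill-conditioned.
A = 1.0 / (np.arange(n)[:, None] + np.arange(n)[None, :] + 1.0)
b = A @ np.ones(n)

x_lu = np.linalg.solve(A, b)   # LU with partial pivoting under the hood
x_inv = np.linalg.inv(A) @ b   # explicit inverse

# Residual norm ||Ax - b|| is a proxy for the backward error.
r_lu = np.linalg.norm(A @ x_lu - b)
r_inv = np.linalg.norm(A @ x_inv - b)

# The LU route is backward stable; the inverse route typically is not.
assert r_lu <= r_inv
```

Both computed solutions may be far from the exact $x$ (the matrix is nearly singular), but the LU solution at least solves a nearby problem almost exactly; the inverse-based solution usually does not.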

There are some rare instances when computing $A^{-1}$ may be valuable, but for solving $Ax = b$, you're taking three times as long to produce an answer with a million times the error.

ANSWER

$L$ and $U$ are triangular matrices. The cost of solving systems with triangular matrices is much less than the cost of inverting a general matrix $A$.

Specifically, you can "invert" $L$ and $U$ using forward and backward substitution, which costs $\mathcal O(n^2)$ per right-hand side, whereas explicitly inverting $A$ costs $\mathcal O(n^3)$.
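The two substitutions can be sketched as follows (a minimal NumPy illustration of the $\mathcal O(n^2)$ loops, not a production routine):

```python
import numpy as np

def forward_sub(L, b):
    """Solve L y = b for lower-triangular L in O(n^2) operations."""
    n = len(b)
    y = np.zeros(n)
    for i in range(n):
        # Subtract the already-known components, then divide by the pivot.
        y[i] = (b[i] - L[i, :i] @ y[:i]) / L[i, i]
    return y

def backward_sub(U, y):
    """Solve U x = y for upper-triangular U in O(n^2) operations."""
    n = len(y)
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):
        x[i] = (y[i] - U[i, i + 1:] @ x[i + 1:]) / U[i, i]
    return x

rng = np.random.default_rng(0)
n = 5
# Well-conditioned triangular test matrices (boosted diagonal).
L = np.tril(rng.standard_normal((n, n))) + n * np.eye(n)
U = np.triu(rng.standard_normal((n, n))) + n * np.eye(n)
b = rng.standard_normal(n)

# Given A = L U, solving A x = b is one forward and one backward sweep.
x = backward_sub(U, forward_sub(L, b))
assert np.allclose((L @ U) @ x, b)
```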

ANSWER

How would we compute $A^{-1}$? We would need to solve $Ax = e_i$ for each standard basis vector $e_i$. And how would we do that? We wouldn't want to repeat the work of Gaussian elimination each time. So we would instead compute the LU factorization of $A$ for a one-time $O(n^3)$ cost, and use it to solve each system $Ax= e_i$ (for $i = 1, \ldots, n$). So we are going to compute the LU factorization of $A$ anyway. But once we have the LU factorization of $A$, there is no need for the further work of computing $A^{-1}$.
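That reasoning can be sketched directly (assuming SciPy's `lu_factor`/`lu_solve` are available): factor once, then recover $A^{-1}$ column by column from the systems $Ax = e_i$.

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

rng = np.random.default_rng(1)
n = 50
A = rng.standard_normal((n, n))

# One O(n^3) factorization...
lu, piv = lu_factor(A)

# ...then one O(n^2) solve per standard basis vector e_i
# gives A^{-1} column by column.
A_inv = np.column_stack([lu_solve((lu, piv), e) for e in np.eye(n)])

assert np.allclose(A_inv @ A, np.eye(n))
```

The point of the answer stands out here: the factorization is the expensive step, and once we have it we can already solve any $Ax = b$, so assembling $A^{-1}$ is pure extra work.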