What is the advantage of LU decomposition over storing the elimination matrix directly?


The answers to this question indicate that one of the advantages of LU decomposition is that the algorithm does not redo the elimination steps.

Why do we insist on storing the factorized form of $L$ and $U$? Why not just save $E$ and $U$, where

\begin{align} Ax = b &\implies EAx = Eb && \text{(multiply both sides by $E$)} \\ &\implies Ux = Eb && \text{(since $EA = U$)} \end{align}

As you can see, we just need to store $U$ and the elimination matrix $E$, so whenever we have a new vector $b$ on the right-hand side, we can just multiply it by $E$ and use back-substitution to solve for $x$. What's the advantage of storing $L$ and $U$ over storing $E$ and $U$?
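For concreteness, here is a small sketch of that scheme in NumPy/SciPy. With partial pivoting, `scipy.linalg.lu` returns $A = PLU$, so the elimination matrix is $E = L^{-1}P^T$, which satisfies $EA = U$; the small random $A$ is just a stand-in:

```python
import numpy as np
from scipy.linalg import lu, solve_triangular

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
b = rng.standard_normal(4)

# A = P @ L @ U, so E = inv(L) @ P.T satisfies E @ A = U.
P, L, U = lu(A)
E = np.linalg.inv(L) @ P.T

# Solve Ax = b as Ux = Eb with one back-substitution.
x = solve_triangular(U, E @ b)

print(np.allclose(A @ x, b))  # True for this well-conditioned A
```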


Note that in your case, the matrix $E$ is $L^{-1}$. So one might just go even further and store $U^{-1}$ as well or even $A^{-1}=U^{-1}L^{-1}$.

So why not just store $A^{-1}$? In a word, accuracy. A matrix $A$ has a so-called condition number $\kappa(A)$. The relative error $\| x_{\rm computed} - x\|$ in solving $Ax = b$ is roughly $\kappa(A)\times 10^{-16}$ in double precision arithmetic (more generally, $10^{-16}$ is roughly the unit roundoff for your floating point number system). However, even when $\kappa(A)$ is large, conventional $LU$ factorization with forward and back substitution tends to be backwards stable in the sense that the residual norm is small: $\|Ax_{\rm computed} - b\| \approx 10^{-16}$. In some applications, a small residual is more important than a small error, so this backwards stability is a desirable property.
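To illustrate, here is a rough sketch using the notoriously ill-conditioned Hilbert matrix (the specific matrix and size are just for demonstration): the residual of an LU-based solve stays near machine precision even though the error does not.

```python
import numpy as np
from scipy.linalg import hilbert

A = hilbert(12)            # condition number around 1e16
x_true = np.ones(12)
b = A @ x_true

x = np.linalg.solve(A, b)  # LU with partial pivoting under the hood

kappa = np.linalg.cond(A)
residual = np.linalg.norm(A @ x - b) / np.linalg.norm(b)
error = np.linalg.norm(x - x_true) / np.linalg.norm(x_true)

# Residual is near unit roundoff; error can be as large as kappa * 1e-16.
print(f"kappa={kappa:.1e}  residual={residual:.1e}  error={error:.1e}")
```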

Unfortunately, if you compute and store $A^{-1} = U^{-1}L^{-1}$, then the computed solution is (usually) no longer backwards stable, and the residual is much larger: $\|Ax_{\rm computed} - b\| \approx \kappa(A)\times 10^{-16}$. The error $\| x_{\rm computed} - x\|$ is of the same magnitude as before, though usually a bit larger than if we had just done back substitution.

So if we store $E=L^{-1}$, our residual norm will usually be higher, but on the order of $\|Ax_{\rm computed} - b\| \approx \kappa(L)\times 10^{-16}$ rather than $\kappa(A)\times 10^{-16}$: we pay for the condition number of $L$, not of $A$. In practice, $L$ is usually fairly well-conditioned even when $A$ is not. For instance, in some tests I did with a randsvd random matrix $A$ with condition number $10^{14}$, the condition number of the $L$ computed with partial pivoting was only about $10^2$, so the residual norm was only about $100\times$ larger than with forward and back substitution.

There are other reasons to prefer storing $L$ over $E$. For example, if $L$ is a sparse matrix, $E = L^{-1}$ may be much less sparse and thus more computationally costly to multiply by.
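For example, for a tridiagonal $A$, the factor $L$ is bidiagonal, while $E = L^{-1}$ fills in the whole lower triangle. A small sketch (the particular diagonally dominant matrix is just illustrative):

```python
import numpy as np
from scipy.linalg import lu

n = 8
# Tridiagonal, diagonally dominant A: no row swaps, L is unit bidiagonal.
A = (np.diag(np.full(n, 4.0))
     + np.diag(np.full(n - 1, -1.0), 1)
     + np.diag(np.full(n - 1, -1.0), -1))
P, L, U = lu(A)
E = np.linalg.inv(L)

nnz_L = np.count_nonzero(np.abs(L) > 1e-14)
nnz_E = np.count_nonzero(np.abs(E) > 1e-14)
print(nnz_L, nnz_E)  # L has ~2n nonzeros; E is a dense lower triangle
```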

You're usually not losing a lot of accuracy or speed with the approach you advocate, but it's not clear what you're gaining either, while you do give up some accuracy by storing $E$ instead of $L$. Unless you're in a setting where you have a good reason to use $E$ instead of $L$ (such settings do exist), I would suggest sticking with $L$.