Why does $L$ have to be lower triangular in the LU factorization?

1.5k Views Asked by At

I was studying LU factorization, when I didn't understand a particular phrase, or rather, how it works or why it works.

During LU factorization L is said to be a lower triangular matrix and U is said to be an upper triangular matrix, for A=LU.

All my book says is: "Suppose A can be reduced to an echelon form U using only row replacements that add a multiple of one row to another row below it. In this case, there exist unit lower triangular elementary matrices $E_1, E_2..... E_p$ such that $$ E_p....E_1A=U. $$ Then $$ A = (E_p...E_1)^{-1}U=LU, $$ where $$ L=(E_p...E_1)^{-1}. $$ All the book says is the unit lower triangular matrix exists not mentioning why it exists or how it exists. A substantial amount of theory missing here. Could anyone please explain this?

3

There are 3 best solutions below

0
On

There is a slightly different LU decomposition called the Cholesky decomposition that creates two upper triangular matrices. The reason primarily is how it constructs the matrices. Gaussian elimination is simple to illustrate and I will show you. I recommend Trefethan and Bau personally.

Gaussian Elimination without Pivoting is as follows. The primary purpose of Gaussian elimination, if you follow this, is to find $\ell_{jk}$ which zeros out the row below. That is why it is the ratio of the two rows and then you subtract them. This continues on and on. Why does it have to be lower triangular? Because that is an identity matrix and it is only entering below the diagonal.

enter image description here

Suppose that

$$ A = \begin{bmatrix} 1 & 1 & 1 \\ 3 & 5 & 6 \\ -2 & 2 & 7 \end{bmatrix} $$ $$ A = LU $$

$$ U =A, L=I$$ $$ k=1,m=3,j=2$$ $$\ell_{21} = \frac{u_{21}}{u_{11}} = \frac{a_{21}}{a_{11}} = 3 $$ $$ u_{2,1:3} = u_{2,1:3} - 3 \cdot u_{1,1:3} $$ Then we're going to subtract 3 times the 1st row from the 2nd row $$ \begin{bmatrix} 3 & 5 & 6 \end{bmatrix} - 3 \cdot \begin{bmatrix} 1 & 1 & 1 \end{bmatrix} = \begin{bmatrix} 0 & 2 & 3\end{bmatrix} $$ Updating each of them $$U = \begin{bmatrix} 1 & 1 & 1 \\ 0 & 2 & 3 \\ -2 & 2 & 7 \end{bmatrix} $$ $$ L =\begin{bmatrix} 1 & 0 & 0 \\ 3 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} $$ $$k=1,j=3,m=3 $$ $$\ell_{31} = \frac{u_{31}}{u_{11}} = \frac{-2}{1} = -2 $$ $$ u_{3,1:3} = u_{3,1:3} +2 \cdot u_{1,1:3} $$ Then we add two times the first row to the third row $$ \begin{bmatrix} -2 & 2 & 7 \end{bmatrix} + 2 \cdot \begin{bmatrix} 1 & 1& 1 \end{bmatrix} = \begin{bmatrix}0 & 4 & 9 \end{bmatrix} $$ Updating $$ U = \begin{bmatrix} 1 & 1 & 1 \\ 0 & 2 & 3 \\ 0 & 4 & 9 \end{bmatrix} $$ $$ L =\begin{bmatrix} 1 & 0 & 0 \\ 3 & 1 & 0 \\ -2 & 0 & 1 \end{bmatrix} $$ $$ k=2, j=3,m=3 $$ $$ \ell_{32} = \frac{u_{32}}{u_{22}} = \frac{4}{2} = 2$$ We're subtracting out little blocks $$ u_{3,2:3} = u_{3,2:3} - 2 \cdot u_{2,2:3} $$ $$ \begin{bmatrix} 4 & 9 \end{bmatrix} - 2 \cdot\begin{bmatrix} 2& 3 \end{bmatrix} = \begin{bmatrix} 0 & 3 \end{bmatrix} $$ Updating $$ U = \begin{bmatrix} 1 & 1 & 1 \\ 0 & 2 & 3 \\ 0 & 0 & 3 \end{bmatrix} $$ $$ L =\begin{bmatrix} 1 & 0 & 0 \\ 3 & 1 & 0 \\ -2 & 2 & 1 \end{bmatrix} $$ It now terminates $$ A = LU $$ $$ \underbrace{\begin{bmatrix} 1 & 1 & 1 \\ 3 & 5 & 6 \\ -2 & 2 & 7 \end{bmatrix}}_{A} = \underbrace{\begin{bmatrix} 1 & 0 & 0 \\ 3 & 1 & 0 \\ -2 & 2 & 1 \end{bmatrix}}_{L} \underbrace{\begin{bmatrix} 1 & 1 & 1 \\ 0 & 2 & 3 \\ 0 & 0 & 3 \end{bmatrix}}_{U} $$

A more specific reason is the generalization of how this works enter image description here enter image description here

4
On

Let's add the first row $a_1$ of $A$, multiplied by $c\in\Bbb R$, to the $k$-th row $a_k$. That is, $$ A = \begin{pmatrix}a_1\\\vdots\\a_k\\\vdots\\a_n\end{pmatrix} $$ becomes $$ A' = \begin{pmatrix}a_1\\\vdots\\ca_1+a_k\\\vdots\\a_n\end{pmatrix} = \begin{pmatrix}1 & & & & \\ & \ddots & & & \\c & & 1 & & \\ & & &\ddots & \\ & & & & 1\end{pmatrix}\begin{pmatrix}a_1\\\vdots\\a_k\\\vdots\\a_n\end{pmatrix} = LA, $$ where in $L$ every blank space is a zero. That's all.

0
On

using only row replacements that add a multiple of one row to another row below it

This operation expressed as elementary matrix $E$ is a lower triangular matrix in this case ("to another row below it", $j > i$).

E.g. see here for how $E=L_{i,j}(m)$ looks like (identity matrix plus $m$ entry in row $j$, column $i$), which adds the $m$-th multiple of row $i$ to row $j$ if applied from the left to some matrix $A$. ($R_j' = R_j + m R_i$)

The inverse operation $E^{-1}$ is subtracting the $m$-th multiple of row $i$ to row $j$. ($R_j' = R_j - m R_i$), which is $L_{i,j}(-m)$. So the inverse is an elementary matrix as well, and a lower triangular matrix as well.

$L$ consists of the product of lower triangular matrices, $$ L = (E_p \dotsb E_1)^{-1} = E_1^{-1} \dotsb E_p^{-1} $$ so it is a lower triangular matrix as well.