In the Cholesky decomposition, the argument of the square root is always positive if the matrix is real and positive definite. Why?


Computing the Cholesky decomposition for an $n \times n$ matrix $A$ you need to evaluate

$$l_{jj} = \sqrt{a_{jj}-\sum_{k=1}^{j-1} l^2_{jk}}$$

The argument of the square root is always positive if $A$ is real and positive definite. Why is that the case?
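For concreteness, the recurrence can be run directly. Here is a minimal Python sketch (the function name `cholesky` and the example matrix are illustrative, not from the question) that raises an error exactly when the square-root argument fails to be positive:

```python
import math

def cholesky(A):
    """Cholesky decomposition via the recurrence above: A = L L^T.

    Raises ValueError if a square-root argument is non-positive,
    which cannot happen when A is real and positive definite.
    """
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for j in range(n):
        # Argument of the square root: a_jj - sum_{k<j} l_jk^2
        s = A[j][j] - sum(L[j][k] ** 2 for k in range(j))
        if s <= 0:
            raise ValueError("matrix is not positive definite")
        L[j][j] = math.sqrt(s)
        for i in range(j + 1, n):
            L[i][j] = (A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))) / L[j][j]
    return L

# An arbitrary positive definite example (all leading minors are positive).
A = [[4.0, 2.0, 2.0],
     [2.0, 5.0, 3.0],
     [2.0, 3.0, 6.0]]
L = cholesky(A)  # L = [[2,0,0], [1,2,0], [1,1,2]]
```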

Best answer:

Let's prove this fact by induction.

Since $A$ is positive definite, $a_{11} > 0$: indeed, $a_{11} = \mathbf e_1^\top A \mathbf e_1 > 0$, where $\mathbf e_1$ denotes the first column of the identity matrix. This is the base case $r = 1$ of the induction.

Suppose that we have successfully decomposed the upper-left $r \times r$ block of $A$, and that the quantity $a_{jj}-\sum_{k=1}^{j-1} l_{jk}^2$ was positive for every $j=1,\dots,r$. The relations between $l_{ij}$ and $a_{ij}$ are $$ l_{jj} = \sqrt{a_{jj} - \sum_{k=1}^{j-1} l_{jk}^2}, \quad j = 1,\dots,r,\\ l_{ij} = \frac{1}{l_{jj}} \left(a_{ij} - \sum_{k=1}^{j-1} l_{ik} l_{jk}\right), \quad j=1,\dots,r-1;\; i=j+1, \dots, r. $$

Under this assumption we can safely square the first relation and multiply the second by $l_{jj} > 0$: $$ l_{jj}^2 = a_{jj} - \sum_{k=1}^{j-1} l_{jk}^2 \Leftrightarrow a_{jj} = \sum_{k=1}^{j} l_{jk}^2,\\ l_{ij}l_{jj} = a_{ij} - \sum_{k=1}^{j-1} l_{ik} l_{jk} \Leftrightarrow a_{ij} = \sum_{k=1}^{j} l_{ik} l_{jk}. $$

These relations are the elementwise form of the matrix equality $$ A^{(r)} = L^{(r)} \left[L^{(r)}\right]^\top, $$ where the superscript $(r)$ denotes the upper-left $r \times r$ submatrix. In other words, the leading $r\times r$ submatrix of $L$ is the Cholesky factor of the leading $r\times r$ submatrix of $A$.
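This identity is also easy to check numerically. A quick NumPy sketch (the random positive definite matrix is an arbitrary illustration, not from the answer): the leading block of the full Cholesky factor coincides with the Cholesky factor of the leading block of $A$.

```python
import numpy as np

# Build an arbitrary positive definite matrix: M M^T + I is PD for real M.
rng = np.random.default_rng(0)
M = rng.standard_normal((5, 5))
A = M @ M.T + np.eye(5)

L = np.linalg.cholesky(A)            # full lower-triangular factor

r = 3
L_r = np.linalg.cholesky(A[:r, :r])  # factor of the leading r x r block

# A^{(r)} = L^{(r)} [L^{(r)}]^T: the leading block of L is exactly L_r.
print(np.allclose(L[:r, :r], L_r))   # True
```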

Let's start by naming the different parts of $A^{(r+1)}$ and $L^{(r+1)}$: $$ A^{(r+1)} = \begin{pmatrix} A^{(r)} & \mathbf v\\ \mathbf v^\top & w \end{pmatrix}, \qquad L^{(r+1)} = \begin{pmatrix} L^{(r)} & \mathbf 0\\ \mathbf t^\top & \ast \end{pmatrix}. $$ I've marked $l_{r+1,r+1}$ as $\ast$ since its existence has not been proven yet. On the other hand, the entries $l_{r+1,k}$, $k=1,\dots,r$, are well defined by $$ l_{r+1, k} = \frac{1}{l_{kk}} \left( a_{r+1, k} - \sum_{m=1}^{k-1} l_{r+1, m} l_{km} \right). $$ Rewriting the same in terms of the vectors $\mathbf v$ and $\mathbf t$ gives $$ t_k = \frac{1}{l_{kk}} \left( v_k - \sum_{m=1}^{k-1} t_m l_{km} \right) \Leftrightarrow v_k = \sum_{m=1}^{k-1} t_m l_{km} + t_k l_{kk} = \sum_{m=1}^k t_m l_{km}. $$ The latter is the elementwise form of $\mathbf v = L^{(r)} \mathbf t$.

Let's now express $a_{r+1,r+1} - \sum_{k=1}^r l_{r+1,k}^2$: $$ a_{r+1,r+1} - \sum_{k=1}^r l_{r+1,k}^2 = w - \sum_{k=1}^r t_k^2 = w - (\mathbf t, \mathbf t). $$ Since $A$ is positive definite, for any vector $\mathbf z$ of length $r$ $$ 0 < Q(\mathbf z) \equiv \begin{pmatrix} \mathbf z^\top & 1 \end{pmatrix} \begin{pmatrix} A^{(r)} & \mathbf v\\ \mathbf v^\top & w \end{pmatrix} \begin{pmatrix} \mathbf z \\ 1 \end{pmatrix} = \mathbf z^\top A^{(r)} \mathbf z + 2 (\mathbf v, \mathbf z) + w. $$ I would like to find a $\mathbf z$ such that $Q(\mathbf z)$ is exactly $w - (\mathbf t, \mathbf t)$. Let's rewrite $Q(\mathbf z)$ as $$ Q(\mathbf z) = \mathbf z^\top A^{(r)} \mathbf z + 2 (\mathbf v, \mathbf z) + w = \mathbf z^\top L^{(r)} \left[L^{(r)}\right]^\top \mathbf z + 2 (L^{(r)} \mathbf t, \mathbf z) + w = \\ = \mathbf z^\top L^{(r)} \left[L^{(r)}\right]^\top \mathbf z + 2 (\mathbf t, \left[L^{(r)} \right]^\top\mathbf z) + w. $$ The matrix $L^{(r)}$ is triangular with a positive diagonal, so its determinant, being the product of the diagonal entries, is nonzero.
Thus $L^{(r)}$ is invertible and we can take $\mathbf z = -\left[L^{(r)}\right]^{-\top} \mathbf t$ (the solution of $\left[L^{(r)}\right]^\top \mathbf z = -\mathbf t$). Plugging this $\mathbf z$ into $Q$ gives $$ Q\left(-\left[L^{(r)}\right]^{-\top} \mathbf t\right) = (\mathbf t, \mathbf t) - 2 (\mathbf t, \mathbf t) + w = w - (\mathbf t, \mathbf t). $$ Since $A$ is positive definite we know that $Q(\cdot) > 0$, hence $w - (\mathbf t, \mathbf t) > 0$. But $w - (\mathbf t, \mathbf t)$ is exactly the expression under the square root in the definition of $l_{r+1, r+1}$, so $l_{r+1,r+1}$ is well defined and positive.
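This step can also be verified numerically. A sketch with NumPy (the test matrix is an arbitrary positive definite example, not from the answer), checking that the special choice of $\mathbf z$ turns $Q(\mathbf z)$ into $w - (\mathbf t, \mathbf t)$, and that this quantity is positive:

```python
import numpy as np

# Arbitrary (r+1) x (r+1) positive definite matrix with r = 3.
rng = np.random.default_rng(1)
M = rng.standard_normal((4, 4))
A = M @ M.T + np.eye(4)

r = 3
A_r, v, w = A[:r, :r], A[:r, r], A[r, r]   # block structure of A^{(r+1)}

L_r = np.linalg.cholesky(A_r)              # A^{(r)} = L^{(r)} [L^{(r)}]^T
t = np.linalg.solve(L_r, v)                # v = L^{(r)} t

# The choice z = -[L^{(r)}]^{-T} t makes Q(z) equal to w - (t, t).
z = np.linalg.solve(L_r.T, -t)
Q = z @ A_r @ z + 2 * v @ z + w

print(np.isclose(Q, w - t @ t))  # True: Q(z) = w - (t, t)
print(w - t @ t > 0)             # True: positive, since Q(z) > 0
```

The quantity $w - (\mathbf t, \mathbf t)$ computed here is the Schur complement of $A^{(r)}$ in $A^{(r+1)}$, which is exactly what ends up under the square root.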

This completes the induction step, and with it the proof.