Solving a system of linear equations that consists of several matrices


I have a task to write a MATLAB program that computes the LU decomposition of a given matrix
$A$ and then solves a system of linear equations using the obtained decomposition:

$Mz = f$, where $M = \begin{pmatrix} -I & A \\ A^T & 0 \end{pmatrix}$.

Before I start coding, I am stuck on the theoretical part of this task, namely how to approach an equation that is built out of several matrices.

Best answer

Let $2m$ denote the dimension of $M$. You are familiar with the case $m=1$. Here
$$ Mz := \begin{bmatrix} 1 & a \\ a & 0 \end{bmatrix} \begin{bmatrix} z_1 \\ z_2 \end{bmatrix} = \begin{bmatrix} f_1 \\ f_2 \end{bmatrix} =: f.$$
The symbols "$:=$" and "$=:$" indicate that we are defining objects, i.e., $M$ and $z$ as well as $f$. This linear system is equivalent to
$$\begin{align} z_1 + a z_2 &= f_1, \\ a z_1 + 0 \cdot z_2 &= f_2. \end{align}$$
In the case $m > 1$, the vectors $f$ and $z$ are still split in two, but the "components" are now vectors in $\mathbb{R}^m$. The linear system reads
$$ Mz := \begin{bmatrix} I & A \\ A^T & 0 \end{bmatrix} \begin{bmatrix} z_1 \\ z_2 \end{bmatrix} = \begin{bmatrix} f_1 \\ f_2 \end{bmatrix} =: f,$$
where $z_i \in \mathbb{R}^{m}$ and $f_i \in \mathbb{R}^{m}$. (I write $I$ rather than the $-I$ of your matrix; the elimination below works the same way with the sign carried along.) This linear system is equivalent to
$$\begin{align} z_1 + A z_2 &= f_1, \\ A^T z_1 + 0 \cdot z_2 &= f_2. \end{align}$$
Your matrix $M$ is a special case of a block matrix (also called a partitioned matrix). A more general discussion of how to multiply block/partitioned matrices can be found elsewhere on this site.
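The block-wise action of $M$ can be checked numerically. Here is a short sketch in Python/NumPy (the question targets MATLAB, where `blkdiag`, concatenation, and `eye` play the same roles); the sizes and data below are made up purely for illustration:

```python
import numpy as np

# Hypothetical sizes and data, only to illustrate the block structure.
m = 3
rng = np.random.default_rng(0)
A = rng.standard_normal((m, m))
z1 = rng.standard_normal(m)
z2 = rng.standard_normal(m)

# Assemble M = [[I, A], [A^T, 0]] from its blocks.
M = np.block([[np.eye(m), A],
              [A.T, np.zeros((m, m))]])

# Multiplying by M acts block-wise on z = [z1; z2] ...
f = M @ np.concatenate([z1, z2])
f1, f2 = f[:m], f[m:]

# ... giving f1 = z1 + A z2 and f2 = A^T z1, as in the equations above.
assert np.allclose(f1, z1 + A @ z2)
assert np.allclose(f2, A.T @ z1)
```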


Standard Gaussian elimination is an example of a scalar algorithm, because it operates on individual real numbers, i.e., scalars. There is a block variant of Gaussian elimination which is structurally identical to the scalar algorithm but operates on blocks. Let us consider your example. It is clear that
$$\begin{align} z_1 + A z_2 &= f_1, \\ A^T z_1 + 0 \cdot z_2 &= f_2 \end{align}$$
implies
$$\begin{align} A^T z_1 + A^T A z_2 &= A^T f_1, \\ A^T z_1 + 0 \cdot z_2 &= f_2. \end{align}$$
We can now eliminate $z_1$ by subtracting the first equation from the second, and find that
$$ -A^T A z_2 = f_2 - A^T f_1. $$
Once this system has been solved for $z_2$, we can compute $z_1$ using
$$ z_1 = f_1 - A z_2. $$

A couple of points are worth mentioning. Up to this point I treated $A$ as a square matrix, deliberately sacrificing generality for the sake of clarity. From here on, I allow $A$ to be an $m$ by $n$ matrix. This implies that $z_1 \in \mathbb{R}^m$ and $z_2 \in \mathbb{R}^n$. I need $A^T A$ to be nonsingular, so $A$ must have full column rank; in particular $m \ge n$, i.e., $A$ must be a tall matrix. Moreover, we should not form the matrix $A^T A$ explicitly, because its condition number is the square of that of $A$, so it can be very ill-conditioned. Instead we can compute a QR factorization of $A$, i.e., $A = QR$, where $Q$ has orthonormal columns and $R$ is upper triangular. Conceptually, this is just the Gram-Schmidt process applied to the columns of $A$. Then $A^T A = R^T Q^T Q R = R^T R$. We see that the Cholesky factorization $A^T A = L L^T$ is immediately available to us with $L = R^T$.
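The elimination above can be sketched as follows, again in Python/NumPy (in MATLAB the same steps map onto `qr(A, 0)` and two backslash solves); the function name `solve_block` and the test data are my own choices, not part of the original task:

```python
import numpy as np

def solve_block(A, f1, f2):
    """Solve [[I, A], [A^T, 0]] [z1; z2] = [f1; f2] by block elimination.

    A is m-by-n with full column rank (m >= n).  A thin QR factorization
    A = QR gives A^T A = R^T R, so A^T A is never formed explicitly.
    """
    R = np.linalg.qr(A, mode="r")        # n-by-n upper-triangular factor
    rhs = A.T @ f1 - f2                  # from -A^T A z2 = f2 - A^T f1
    # Two triangular solves: R^T (R z2) = rhs.
    z2 = np.linalg.solve(R, np.linalg.solve(R.T, rhs))
    z1 = f1 - A @ z2
    return z1, z2

# Hypothetical data: a tall 5-by-3 matrix with full column rank.
rng = np.random.default_rng(1)
A = rng.standard_normal((5, 3))
f1, f2 = rng.standard_normal(5), rng.standard_normal(3)
z1, z2 = solve_block(A, f1, f2)

# Verify against the original block equations.
assert np.allclose(z1 + A @ z2, f1)
assert np.allclose(A.T @ z1, f2)
```

With `np.linalg.solve` replaced by a dedicated triangular solver (e.g. `scipy.linalg.solve_triangular`, or backslash in MATLAB, which detects triangularity), each solve costs only $O(n^2)$.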