Backward stable algorithm

2.6k Views Asked by At

Assume we have fixed unitary matrices $Q_1, \dots, Q_k \in \mathbb{C}^{m,m}$ and a matrix $A \in \mathbb{C}^{m,n}$ which can be perturbed. How can we proof that the algorithm on computing the product $Q_k \cdots Q_1 A$ from left to right is backward stable? And what if the $Q$-matrices are not unitary anymore?

2

There are 2 best solutions below

6
On BEST ANSWER

If the matrices $Q_i$ ($i=1,\ldots,k$) are unitary, then (assuming the product is "computed exactly"), we have for some perturbation $E\in\mathbb{C}^{m,n}$ of $A$ that $$ Q_k\cdots Q_1 (A+E) = Q_k\cdots Q_1 A + F, $$ where $F=Q_k\cdots Q_1 E$ and $\|F\|=\|Q_k\cdots Q_1 E\|=\|E\|$ for any unitarily invariant matrix norm $\|\cdot\|$. Hence the perturbation of the result $F$ has the same "size" as the perturbation $E$ of the input ($A$).

If the matrices $Q_i$ are not unitary, then simply $$ \|F\|=\|Q_k\cdots Q_1 A\|\leq\left(\prod_{i=1,\ldots,k}\|Q_i\|\right)\|E\|. $$ In finite precision, most well-behaving orthogonalisation algorithms (like modified, Householder, or Givens implementations of the Gram-Schmidt algorithm) produce the orthogonal factors such that $\|Q_i\|\approx 1$ up to a perturbation of the order given by the machine precision of the underlying finite precision arithmetic, so even "non-exactly" computed matrices $Q_i$ need not to harm the backward error of the computed matrix product. If the matrices $Q_i$ are arbitrary, then of course the norm of $F$ can be much larger than that of $E$ and there might be no guarantee of a backward stable result.

0
On

AlgebraicPavel has provided a proof. Here is some further discussion on the case when $Q$ is not unitary. For simplicity, I only consider the case $k=1$, i.e., $f(A)=QA$ where $Q$ is nonunitary but fixed.

Case 1: $Q$ is singular. Then in general $\tilde f(A)$ is not backward stable. For example, suppose $\text{rank}(Q) = 1$. Then theoretically $\text{rank}(QA)\leq1$. However, due to round-off errors, it is very likely that $\text{rank}(\tilde f(A))>1.$ Because there is no any perturbation $\delta A$ that could make $f(A+\delta A)=Q(A+\delta A)$ has rank higher than 1, we conclude that $\tilde f(A)$ is not backward stable in general.

Case 2: Q is nonsingular. Below is a proof that $\tilde f(A)$ will be backward stable. However, the proportionality constant will depend on the condition number of $Q$, denotes as $\kappa(Q)$.

Pf. Because $Q$ is nonsingular, we simply define $\delta A=Q^{-1}(\tilde f(A)-QA)$. The goal is to show that $\|\delta A\|=O(\epsilon)\|A\|$. Now, $$\|\delta A\|=\|Q^{-1}(\tilde f(A)-QA)\|\leq\|Q^{-1}\|\cdot\|\tilde f(A)-QA\|$$ Note that each $ij$ entry of $\tilde f$ is in the form of an inner product, which is known to be a backward stable operation. Denote the $i$th row of $Q$ as $q_i^*$, the $j$th column of $A$ as $a_j$. Then there exist $\delta q_i^*$ and $\delta a_j$ such that

$$\tilde f(A)_{ij}=(q_i+\delta q_i)^*(a_j+\delta a_j)$$

with $\|\delta q_i\|=\|q_i\|O(\epsilon)$ and $\|\delta a_j\|=\|a_j\|O(\epsilon)$. Hence

$$(\tilde f(A)-QA)_{ij}=\delta q_i^*a_j+q_i^*\delta a_j+\delta q_i^*\delta a_j$$

Using triangular and Cauchy-Schwarz inequalities, we have a bound for the $ij$ entry of the forward error:

$$|(\tilde f(A)-QA)_{ij}|\leq\|\delta q_i^*\|\|a_j\|+\|q_i^*\|\|\delta a_j\|+\|\delta q_i^*\|\|\delta a_j\|$$

$$=\left[\frac{\|\delta q_i^*\|}{\|q_i^*\|}+\frac{\|\delta a_j\|}{\|a_j\|}+\frac{\|\delta q_i^*\|\|\delta a_j\|}{\|q_i^*\|\|a_j\|}\right]\cdot\|q_i^*\|\|a_j\|\leq O(\epsilon)\|Q\|\|A\|$$

where in the last step I use the fact that $\|q_i^*\|\leq \alpha\|Q\|$ and $\|a_j\|\leq \beta\|A\|$ for any $i,j$ and some constants $\alpha,\beta$ that depend on the selected norms. Note that this bound is then independent of $i$ and $j$. Hence we have

$$ \|\tilde f(A)-QA\|\leq O(\epsilon)\|Q\|\|A\| $$

and thus

$$ \|\delta A\|\leq O(\epsilon)\|Q^{-1}\|\|Q\|\|A\|=O(\epsilon)\kappa(Q)\|A\|$$

This shows that when $Q$ is nonsingular, the matrix product $QA$ has backward stability, while the actual backward error can be very large if $Q$ is ill-conditioned.