Different condition numbers of $\begin{pmatrix} a & b \\ b & c \end{pmatrix}$

338 Views Asked by At

Let $a,b,c \in \mathbb{R}$ and $A := \begin{pmatrix} a & b \\ b & c \end{pmatrix}$ and $\det(A) \neq 0$. Find the condition number with respect to the 1-, 2- and $\infty$-norm and discuss for which relationships among $a,b,c$ the corresponding linear systems are well conditioned.

What I've done so far

We have $A^{-1} = \frac{1}{ac - b^2}\begin{pmatrix} c & -b \\ - b & a\end{pmatrix}$ and, therefore, \begin{align} \kappa_1(A) & = \| A \|_1 \| A^{-1} \|_1 = \max\{a + b, b + c\} \cdot\max\left\{\frac{c - b}{ac - b^2}, \frac{a - b}{ac - b^2} \right\} \\ & = \big( b + \max(a, c)\big) \cdot \frac{1}{ac - b^2}\big( \max(a, c) - b\big) \\ & = \frac{b^2 - (\max(a, c))^2}{b^2 - ac} \end{align} Note: The above steps only work for $\det(A) > 0$ and $a,c > b$ and are the same for the $\| \cdot \|_1$-norm, since the matrix is symmetric.

Now, I'm not sure to find out for which values of $a,b,c$ this would be small (other than that for $a = c$, it's 1 and for $a = b = c$ and $b = a \ge c$ it's undefined) and especially not how to find this out which out plotting it with a computer. While plotting I noticed that when |b| gets large, the term is very close to one for a sufficiently large plane around zero (i.e for $|b| = 60$ the term is very similar to 1 for all $(x,y) \in [-25,25]^2$).

For $a = -c$, the term becomes $\frac{b^2 - c^2}{b^2 + c^2} = \frac{2 b^2}{b^2 + c^2} - 1$, because $\max(-c,c) = |c|$.

Also, you can use \begin{align}\tag{$\star$} \label{eq:star} \max(x,y) = \frac{x + y + |x - y|}{2} \end{align} to obtain $$ \kappa_1(A) = \frac{1}{2}\big( 2b + a + c + |a - c|\big) \left( \frac{a - b}{ac - b^2} + \left| \frac{2c - a - b}{ac - b^2} \right| \right) $$


Similar problems arise for the next norm: We have $\kappa_2 = \sqrt{\rho(A^T A) \rho((A A^T)^{-1})}$ and $A^T A = A A^T = A^2$. Now, we have \begin{gather} A^2 = \begin{pmatrix} a^2 + b^2 & b(a + c) \\ b(a + c) & b^2 + c^2 \end{pmatrix} \\ (A A^T)^{-1} = (A^{2})^{-1} = \frac{1}{(b^2 - ac)^2} \begin{pmatrix} b^2 + c^2 & -b(a + c) \\ -b(a + c) & a^2 + b^2 \end{pmatrix}, \end{gather} where $\rho(B)$ is the largest eigenvalue of $B$. Now we obtain the eigenvalues by finding roots of the [characteristic polynomial][1]: \begin{gather*} \det(A^2 - t \cdot I_2) = (a^2 + b^2 - t)(b^2 + c^2 - t) - b^2(a + c) \overset{!}{=} 0 \\ \implies t_{1,2} = \frac{a^2 + 2b^2 + c^2 \pm(a + c) \sqrt{(a - c)^2 + 4b^2}}{2} \end{gather*} and, [similarly][2] $$ \rho(A^{-2}) = \max\left\{ \frac{a^2 + 2b^2 + c^2 + \pm (a + c) \sqrt{(a - c)^2 + 4b^2}}{2(b^2 - ac)^2} \right\} $$ So, \begin{align} \kappa_2(A) & = \sqrt{\max\left\{\frac{a^2 + 2b^2 + c^2 \pm(a + c) \sqrt{(a - c)^2 + 4b^2}}{2} \right\} \cdot \max\left\{ \frac{a^2 + 2b^2 + c^2 \pm (a + c) \sqrt{(a - c)^2 + 4b^2}}{2(b^2 - ac)^2} \right\} } \\ & = \frac{1}{| b^2 - ac |} \max\left\{ \frac{a^2 + 2b^2 + c^2 \pm (a + c) \sqrt{(a - c)^2 + 4b^2}}{2} \right\} \\ & \overset{\eqref{eq:star}}{=} \frac{a^2 + 2b^2 + c^2 + |a + c|\sqrt{(a - c)^2 + 4b^2}}{2| b^2 - ac |} \end{align} How do I simplify this result?
2

There are 2 best solutions below

0
On BEST ANSWER

Below follows an alternative analysis which does not rely on Gershgorin's theorem.


We begin the analysis with a sequence of elementary steps designed to reduce the number of variables and simplify the necessary calculations.

In general, we have $$\|A\|_\infty = \|A^T\|_1.$$ If $A$ is symmetric and non-singular, then $A^{-1}$ is also symmetric. It follows that $$ \kappa_\infty(A) = \kappa_1(A).$$ Since our matrix $$A = \begin{bmatrix} a & b \\ b & c \end{bmatrix}$$ is symmetric, we are therefore free to concentrate on, say, the 2-norm and the infinity norm.

We will distinguish between the case of $b=0$ and $b \not = 0$. If $b=0$, then $$ \|A\|_2 = \|A\|_\infty = \max \{|a|,|c|\}, \|A^{-1}\|_2 = \|A^{-1}\|_\infty = \max \{|a|^{-1},|c|^{-1}\}.$$ It follows, that $$ \kappa_2(A) = \kappa_\infty(A) = \max\left\{ \frac{|a|}{|c|}, \frac{|c|}{|a|} \right\}$$ In the case of $b = 0$, it is clear that $A$ is well-conditioned precisely when $|a| \approx |c|$ and ill conditioned when $|a| \ll |c|$ or $|c| \ll |a|$.

In the case of $b \not = 0$ we can without loss of generality assume that $b = 1$. If $b \not = 1$, then we simply scale the matrix with $b^{-1}$. The condition numbers are invariant under this scaling because $(b^{-1}A)^{-1} = b A^{-1}$. We can therefore concentrate on the case of $b=1$ where $$A = \begin{bmatrix} a & 1 \\ 1 & c \end{bmatrix}.$$ This matrix is non-singular if and only if $ac \not =1$. In this case, we have $$A^{-1} = \frac{1}{1 - ac} \begin{bmatrix} c & -1 \\ -1 & a \end{bmatrix}.$$ It follows, that $$ \|A\|_\infty = \max\{1 + |a|,1+|c|\}, \quad \|A^{-1}\|_\infty = \frac{1}{|1 - ac|}\max\{1 + |a|,1+|c|\},$$ which implies $$ \kappa_\infty(A) = \frac{1}{|1 - ac|}\max\{(1 + |a|)^2,(1+|c|)^2\}$$ A contour plot of the right hand side could now be obtained. A detailed understanding can be developed by covering the punctured plane $\mathbb{R}^2 - \{(0,0)\}$ with hyperbolas, i.e., curves of the form $$ac = \gamma$$ where $\gamma \not = 0$. On the curve corresponding to a $\gamma \not = 1$ we have $$ \kappa_\infty(A) \ge \frac{1}{|1 - \gamma|} \left( 1 + \sqrt{|\gamma|}\right)^2$$ with equality achieved when $|a|=|c| = \sqrt{|\gamma|}$. Moreover, $$ \kappa_\infty(A) \rightarrow \infty, $$ as $$\max\{|a|,|c|\} \rightarrow \infty, \quad ac=\gamma.$$ This covers the analysis of the infinity norm.

Explicit calculation of the 2-norm condition number is tedious, but a short-cut is possible because of the equivalence of norms. In general, we have $$ \|x\|_\infty \leq \| x\|_2 \leq \sqrt{n} \|x\|_\infty $$ This implies that $$ \kappa_2(A) \leq n \kappa_\infty(A) \leq n^2 \kappa_2(A)$$ Why? We have $$ \|Ax\|_2 \leq \sqrt{n} \|Ax\|_\infty \leq \sqrt{n} \|A\|_\infty \|x\|_\infty \leq \sqrt{n} \|A\|_\infty \|x\|_2 $$ This implies that $$ \|A\|_2 \leq \sqrt{n} \|A\|_\infty$$ Similarly, we have $$ \|Ax\|_\infty \leq \|Ax\|_2 \leq \|A\|_2 \|x\|_2 \leq \sqrt{n} \|A\|_2 \|x\|_\infty.$$ This implies that $$ \|A\|_\infty \leq \sqrt{n} \|A\|_2.$$ In our case $n=2$, so $$ \kappa_2(A) \leq 2 \kappa_\infty(A) \leq 4 \kappa_2(A)$$ or equivalently $$ \frac{1}{2} \kappa_\infty(A) \leq \kappa_2(A) \leq 2 \kappa_\infty(A). $$ In other words, when it comes to the conditioning of the matrix $A$ there is little to be learned from the 2-norm which cannot be discerned from the infinity norm.

0
On

This is not the complete analysis that the instructor requested, but a simplification which can be used to verify certain aspects of the solution.

It is important to note that a matrix is either well-conditioned or it is ill-conditioned. In particular, it does not matter which norm we use to compute the condition number. This follows from the fact that all induced matrix norms are equivalent.

The 2-norm condition number of the matrix $A$ can be computed from the eigenvalues of the matrix $A^TA$. Specifically, we have $$\kappa_2(A) =\sqrt{\frac{\lambda_{\max}(A^TA)}{\lambda_{\min}(A^TA)}}.$$ We have already seen that $$A^TA = \begin{bmatrix} a^2+b^2 & b(a+c) \\ b(a+c) & a^2 + c^2 \end{bmatrix}.$$ Now by Gershgorin's circle theorem and fact that $A^TA$ is symmetric positive semidefinite, the eigenvalues of $A^TA$ can be found in the interval $$I = [a^2+b^2 - |b(a+c)|,a^2+b^2 + |b(a+c)|] \cap [0, \infty).$$ We observe that $0$ can be an eigenvalue only when $$a^2 + b^2 - |b(a+c)| \leq 0.$$ If $a^2 + b^2 - |b(a+c)| > 0$, then the condition number is bounded by $$ \kappa_2(A) = \sqrt{\frac{a^2 + b^2 + |b(a+c)|}{a^2 + b^2 - |b(a+c)|}} = \sqrt{\frac{ 1 + \frac{|b(a+c)|}{a^2+b^2}}{1 - \frac{|b(a+c)|}{a^2+b^2}}} $$ We conclude that the matrix $A$ is necessarily well-conditioned when $$ |b(a+c)| \ll a^2+b^2$$ and that it can be ill-conditioned when $$|b(a+c)| \geq a^2+b^2$$ A few sanity check are in order. If $c=-a \not = 0$, then our matrix is orthogonal and perfectly conditioned. If $a=b=c$, then our matrix is singular.


It is possible to say with certainty that the matrix is ill-conditioned when $$|b| < \min\{|a|,|c|\} \ll \max\{|a|,|c|\}.$$ In this situation, Gershgorin's circles for $A$ are disjoint and it follows that one eigenvalue of $A$ is substantially smaller (in absolute value) than the other. This in turn implies that $A$ is ill-conditioned.