How does a computer calculate the matrix norm?


Consider the matrix $$ A = \begin{bmatrix}6&-1\\2&3\end{bmatrix} $$ We can calculate the eigen-decomposition by hand as $$ \begin{bmatrix}6&-1\\2&3\end{bmatrix} = \begin{bmatrix}1&1\\2&1\end{bmatrix} \begin{bmatrix}4&0\\0&5\end{bmatrix} \begin{bmatrix}-1&1\\2&-1\end{bmatrix} $$ It has two eigenvalues, 4 and 5. Wolfram calculates its operator norm precisely at $2\sqrt{10}$.
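As a quick sanity check, the hand-computed factorization and Wolfram's value can both be verified numerically (this sketch assumes NumPy is available):

```python
import numpy as np

A = np.array([[6.0, -1.0], [2.0, 3.0]])

# Eigen-decomposition A = P D P^{-1} from the hand calculation above.
P = np.array([[1.0, 1.0], [2.0, 1.0]])
D = np.diag([4.0, 5.0])
P_inv = np.array([[-1.0, 1.0], [2.0, -1.0]])

assert np.allclose(P @ D @ P_inv, A)                       # the factorization checks out
assert np.isclose(np.linalg.norm(A, 2), 2 * np.sqrt(10))   # operator (spectral) norm
```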

Question: How? Evidently Wolfram reads this value off from some algebra involving the eigenvalues and eigenvectors.

The operator norm here is defined as $$ \|A\| = \sup_{\|x\|=1} \|Ax\| $$ If we instead looked at a symmetric matrix $B$, its operator norm would coincide with its largest eigenvalue in absolute value. This makes sense because a symmetric matrix has orthogonal eigenvectors, so any unit vector $x$ can be decomposed in terms of those eigenvectors, and $\|Bx\|$ is maximized by putting all of the weight of $x$ on the eigenvector corresponding to the dominant eigenvalue.
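The symmetric case can be checked numerically; the matrix $B$ below is just an illustrative choice, not one from the question (assumes NumPy):

```python
import numpy as np

# For a symmetric matrix, the operator norm equals the largest |eigenvalue|.
B = np.array([[2.0, 1.0], [1.0, 2.0]])    # example symmetric matrix (an assumption)
eigvals = np.linalg.eigvalsh(B)           # eigenvalues of B: 1 and 3
assert np.isclose(np.linalg.norm(B, 2), np.abs(eigvals).max())
```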

However, $A$ is not symmetric, and its eigenvectors $[1 \ \ 2]^T$ and $[1 \ \ 1]^T$ are not orthogonal; the angle between them is $$ \theta = \arccos\left(\frac{3}{\sqrt{10}}\right) $$ The intuition is that it should be possible to make $\|Ax\|$ larger than both 4 and 5 by choosing a unit vector $x$ that exploits this geometry. If two eigenvectors are close to collinear, then unit vectors can be formed from large linear combinations of them, which may produce a large value of $\|Ax\|$ when the eigenvalues differ substantially.
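The stated angle can be confirmed directly from the two eigenvectors (assumes NumPy):

```python
import numpy as np

v1 = np.array([1.0, 2.0])   # eigenvector for eigenvalue 4
v2 = np.array([1.0, 1.0])   # eigenvector for eigenvalue 5
cos_theta = v1 @ v2 / (np.linalg.norm(v1) * np.linalg.norm(v2))
assert np.isclose(cos_theta, 3 / np.sqrt(10))
theta = np.arccos(cos_theta)  # about 0.3217 radians (~18.4 degrees)
```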

I tried to set up the Lagrangian to maximize $$ \left\|A\left(\alpha \begin{bmatrix}1 \\ 2 \end{bmatrix} + \beta \begin{bmatrix}1 \\ 1 \end{bmatrix}\right)\right\|^2 $$ subject to $$ \left\|\alpha \begin{bmatrix}1 \\ 2 \end{bmatrix} + \beta \begin{bmatrix}1 \\ 1 \end{bmatrix}\right\|^2 = 1 $$ but it is quite messy. I have never seen a simple expression for the operator norm of a diagonalizable matrix, but it seems as though there should be a simple closed-form solution (especially in the 2-dimensional case) in terms of the eigenvalues and the angle $\theta$ between eigenvectors.
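Instead of the Lagrangian, the same constrained maximization can be done numerically in 2D by parameterizing the unit circle as $x = (\cos t, \sin t)$ and scanning $t$; this is a brute-force sketch, not a closed form (assumes NumPy):

```python
import numpy as np

A = np.array([[6.0, -1.0], [2.0, 3.0]])

# Parameterize unit vectors x = (cos t, sin t) and maximize ||Ax|| over the circle.
t = np.linspace(0.0, 2 * np.pi, 200001)
X = np.vstack([np.cos(t), np.sin(t)])      # unit vectors as columns
norms = np.linalg.norm(A @ X, axis=0)      # ||Ax|| for every sampled x
assert np.isclose(norms.max(), 2 * np.sqrt(10), atol=1e-6)
```

The maximum exceeds both eigenvalues (since $2\sqrt{10} \approx 6.32 > 5$), matching the intuition above that non-orthogonal eigenvectors let $\|Ax\|$ beat the largest eigenvalue.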

There is 1 answer below.

The operator norm is the largest singular value, i.e. the square root of the largest eigenvalue of $A^TA$. In this case, $$A^TA = \begin{pmatrix} 6 & 2 \\ -1 & 3\end{pmatrix}\begin{pmatrix} 6 & -1 \\ 2 & 3\end{pmatrix} = \begin{pmatrix} 40 & 0 \\ 0 & 10\end{pmatrix} $$ Here $A^TA$ happens to be diagonal, so its eigenvalues can be read off directly, and $\|A\| = \sqrt{40} = 2\sqrt{10}$.
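This is exactly how numerical software computes it in practice, via the singular value decomposition; a short check (assumes NumPy):

```python
import numpy as np

A = np.array([[6.0, -1.0], [2.0, 3.0]])
AtA = A.T @ A
assert np.allclose(AtA, np.diag([40.0, 10.0]))       # A^T A is diagonal for this A

sigma_max = np.sqrt(np.linalg.eigvalsh(AtA).max())   # largest singular value of A
assert np.isclose(sigma_max, 2 * np.sqrt(10))

# The SVD route gives the same answer without forming A^T A explicitly.
assert np.isclose(np.linalg.svd(A, compute_uv=False)[0], sigma_max)
```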