What does the equality $\|Ax\|_2=\|A\|_2 \|x\|_2$ mean?

We know that the matrix norm $\|A\|_2$ is compatible with the euclidean vector norm $\|x\|_2$, i.e. $\|Ax\|_2 \leq \|A\|_2 \|x\|_2$ for all matrices $A$ and vectors $x$.

I am trying to understand what happens when equality is attained, i.e. $\|Ax\|_2=\|A\|_2 \|x\|_2$. Is there, for example, a geometric interpretation? Or a statement such as: this equality only holds for unitary matrices?

I am sorry about the confusion, I will try to clarify by splitting my question:

  1. Does $\|Ax\|_2 = \|A\|_2 \|x\|_2$ for all $x$ only hold for special matrices, e.g. unitary matrices?
  2. What happens geometrically, if $\|Ax\|_2 = \|A\|_2 \|x\|_2$ for all $x$?
  3. What happens geometrically, if $\|Ax\|_2 = \|A\|_2 \|x\|_2$ for a special $x$?
  4. What else does $\|Ax\|_2 = \|A\|_2 \|x\|_2$ for all $x$ tell you?

There are 2 best solutions below

3
On BEST ANSWER

The condition $\|Ax\|_2 = \|A\|_2 \|x\|_2$ occurs when $x$ is an eigenvector of $\sqrt{A^*A}$ of maximum eigenvalue $\sigma$. Eigenvalues of this matrix are known as singular values of $A$, and the largest such singular value actually turns out to be equal to $\|A\|_2$.

To understand this, you need to understand the polar decomposition of a matrix. It decomposes any square complex matrix $A$ into $UP$, where $P$ is positive-semidefinite (indeed $P = \sqrt{A^* A}$), and $U$ is unitary (orthogonal, if $A$ is real).

So, what does this polar decomposition actually do? The matrix $P$ is positive-semidefinite, which means that it has an orthonormal basis of eigenvectors and non-negative real eigenvalues. This means the matrix $P$ will squish or stretch the unit ball in certain directions, by a factor of its eigenvalues (a.k.a. singular values of $A$).

For example, if the singular values of a $2 \times 2$ matrix are $0.5$ and $3$, then you would expect the unit ball to be shrunk along one axis by a factor of $0.5$, but expanded along another axis by a factor of $3$. This will produce an ellipse, with a major axis that is $3/0.5 = 6$ times longer than its minor axis. The trick is, these axes may not be the $x$ and $y$ axes, but you can say with confidence that these axes will be orthogonal to each other.
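
To make this concrete, here is a small worked case, assuming (purely for illustration) that the two axes happen to be the coordinate axes, so that $P = \operatorname{diag}(3, 0.5)$. A unit vector $x = (\cos t, \sin t)$ is sent to

$$Px = (3\cos t,\ 0.5\sin t), \qquad \|Px\|_2^2 = 9\cos^2 t + 0.25\sin^2 t = 9 - 8.75\sin^2 t \le 9,$$

so the image traces the ellipse $(u/3)^2 + (v/0.5)^2 = 1$, and the stretch factor $3$ is reached only when $\sin t = 0$, i.e. at $x = \pm(1, 0)$.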

The matrix $U$ is unitary (orthogonal in the real case), which makes the transformation $x \mapsto Ux$ an isometry, meaning it doesn't mess with distances. Indeed, it doesn't mess with angles either. The matrix $U$ is then applied to our ellipse (note that, in the expression $UPx$, the $U$ matrix is applied second, after $P$ is applied!), which rotates or flips this ellipse. Note, it doesn't change the shape of the ellipse, nor does it grow or shrink it. After applying $P$, then $U$, the result is the same as applying $A = UP$.
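
If you want to see the two factors numerically, here is a minimal sketch in plain Python. It is restricted to invertible real $2 \times 2$ matrices and relies on a closed-form square root that is specific to the $2 \times 2$ case (it follows from the Cayley-Hamilton theorem); the helper names `matmul`, `transpose` and `polar_2x2` and the sample matrix are my own choices for illustration, not anything standard.

```python
import math

def matmul(X, Y):
    """Product of two 2x2 matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(X):
    """Transpose of a 2x2 matrix."""
    return [[X[0][0], X[1][0]], [X[0][1], X[1][1]]]

def polar_2x2(A):
    """Return (U, P) with A = U P for an invertible real 2x2 matrix A.

    Assumption: A is invertible, so P = sqrt(A^T A) is positive-definite.
    """
    G = matmul(transpose(A), A)                  # G = A^T A = P^2
    det_G = G[0][0] * G[1][1] - G[0][1] * G[1][0]
    s = math.sqrt(det_G)                         # det(P)
    t = math.sqrt(G[0][0] + G[1][1] + 2.0 * s)   # trace(P)
    # 2x2 identity from Cayley-Hamilton: P = (G + det(P) I) / trace(P)
    P = [[(G[0][0] + s) / t, G[0][1] / t],
         [G[1][0] / t, (G[1][1] + s) / t]]
    det_P = P[0][0] * P[1][1] - P[0][1] * P[1][0]
    P_inv = [[P[1][1] / det_P, -P[0][1] / det_P],
             [-P[1][0] / det_P, P[0][0] / det_P]]
    U = matmul(A, P_inv)                         # U = A P^{-1}
    return U, P

A = [[1.0, 2.0], [0.0, 1.0]]   # hypothetical sample matrix (a shear)
U, P = polar_2x2(A)
UtU = matmul(transpose(U), U)  # should be close to the identity
UP = matmul(U, P)              # should reproduce A
print("U =", U)
print("P =", P)
```

Up to rounding, `UtU` comes out as the identity (so `U` is an isometry), `P` is symmetric, and `UP` reproduces `A`.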

You can think about polar decomposition in much the same way as you think about polar form for complex numbers. The idea is to express a complex number by $re^{i\theta}$, where $r$ is like the positive-semidefinite factor $P$, and $e^{i\theta}$ is like the unitary matrix $U$. When you multiply a complex number by $re^{i\theta}$, the $r$ factor stretches or shrinks the complex number by some non-negative factor (well, by $r$, of course!), while the $e^{i\theta}$ performs a rotation (which is a type of isometry).

Anyway, the point is that the positive-semidefinite matrix $P$ is the one that actually determines the shape of the unit ball when mapped under $x \mapsto Ax$; the $U$ factor simply determines how it's oriented. That shape determines the operator norm of $A$, because it determines how far out the ellipse pokes out from the unit ball it came from. In our previous example, the ellipse was $3$ times larger on one axis, but half as large in the other. So, along that major axis, a vector grew by a factor of $3$, meaning that $Px = 3x$.

Note that no other unit vector, aside from $-x$, can grow that much under multiplication by $P$. If one did, its image would lie on the circle of radius $3$, but since our ellipse isn't a circle, the circle and the ellipse touch tangentially at only two points: $Px = 3x$ and $-3x$. Every other unit vector grows, or possibly even shrinks, by a smaller factor.

Applying $U$ doesn't change the length of these vectors, so no matter what, $3$, our maximum singular value, is the greatest growth in length that multiplication by $UP = A$ can possibly produce. This holds in higher dimensions too: the greatest singular value is $\|A\|_2$, and so $\|Ax\|_2 = \|A\|_2\|x\|_2$ holds if and only if $x = 0$ or $x$ is an eigenvector of $P = \sqrt{A^*A}$ for its largest eigenvalue, the maximum singular value $\|A\|_2$.
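
Here is a rough numerical check of that claim, again in plain Python and only for real $2 \times 2$ matrices. It builds a matrix with singular values $3$ and $0.5$ as in the example, finds the top eigenvector of $A^T A$ in closed form, and compares the stretch factors. The rotation angles `0.7` and `0.3` and all helper names are arbitrary choices of mine for illustration.

```python
import math

def rot(theta):
    """2x2 rotation matrix by angle theta."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def matmul(X, Y):
    """Product of two 2x2 matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(X):
    """Transpose of a 2x2 matrix."""
    return [[X[0][0], X[1][0]], [X[0][1], X[1][1]]]

def matvec(A, x):
    """Multiply a 2x2 matrix by a length-2 vector."""
    return [A[0][0] * x[0] + A[0][1] * x[1],
            A[1][0] * x[0] + A[1][1] * x[1]]

def norm2(x):
    """Euclidean norm of a length-2 vector."""
    return math.hypot(x[0], x[1])

def top_singular_pair(A):
    """Return (sigma_max, v): the largest singular value of A and a unit
    eigenvector v of A^T A for its largest eigenvalue (real 2x2 only)."""
    # Entries of the symmetric matrix A^T A = [[a, b], [b, d]].
    a = A[0][0] ** 2 + A[1][0] ** 2
    b = A[0][0] * A[0][1] + A[1][0] * A[1][1]
    d = A[0][1] ** 2 + A[1][1] ** 2
    # Largest eigenvalue of a symmetric 2x2 matrix, in closed form.
    lam = 0.5 * (a + d) + math.hypot(0.5 * (a - d), b)
    # Both candidates solve (A^T A - lam I) v = 0; keep the longer one.
    v = max([[b, lam - a], [lam - d, b]], key=norm2)
    n = norm2(v)
    if n == 0.0:               # A^T A is a multiple of I: any v works
        v, n = [1.0, 0.0], 1.0
    return math.sqrt(lam), [v[0] / n, v[1] / n]

# A = U P with P having eigenvalues 3 and 0.5 along rotated axes.
Q = rot(0.3)
P = matmul(matmul(Q, [[3.0, 0.0], [0.0, 0.5]]), transpose(Q))
A = matmul(rot(0.7), P)

sigma_max, v = top_singular_pair(A)
w = [-v[1], v[0]]              # unit vector orthogonal to v

ratio_v = norm2(matvec(A, v)) / norm2(v)   # stretch along the top direction
ratio_w = norm2(matvec(A, w)) / norm2(w)   # stretch along the other axis
print(sigma_max, ratio_v, ratio_w)
```

The first two printed numbers agree (both approximately $3$, which is $\|A\|_2$ here), while the third is approximately $0.5$: equality is attained along `v` and fails along `w`.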

So, what does it mean when $\|Ax\|_2 = \|A\|_2\|x\|_2$ for all $x$? It means the ellipse we get from mapping the unit circle under $A$ is just a circle of radius $\|A\|_2$. This means that $P = \sqrt{A^*A} = \|A\|_2 I$, i.e. just a scaling map. This means that $A$ is just a scalar times a unitary map $U$ (orthogonal in the real case), i.e. $A$ only rotates or reflects and scales uniformly and non-negatively.
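
If you prefer an algebraic check of that last step, here is a short derivation (a sketch, writing $c = \|A\|_2$ for brevity). Since $\|Ax\|_2^2 = x^* A^* A x$, the assumption that equality holds for all $x$ says

$$x^* A^* A\, x = c^2\, x^* x \quad\text{for all } x, \qquad\text{i.e.}\qquad x^*\left(A^*A - c^2 I\right)x = 0 \quad\text{for all } x.$$

The matrix $A^*A - c^2 I$ is Hermitian, and a Hermitian matrix whose quadratic form vanishes identically has all eigenvalues equal to $0$, so it is the zero matrix. Hence $A^*A = c^2 I$, and taking the positive-semidefinite square root gives $P = \sqrt{A^*A} = c\,I$.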

Hope that helps!

2
On

Taking the example of a real symmetric matrix $A$, which is orthogonally diagonalizable, $\|A\|_2$ is the largest modulus of an eigenvalue, and the equality means that $x$ lies in the span of the eigenspaces of the eigenvalues having that largest modulus.
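
As a quick illustration (the matrix is just a convenient choice), take

$$A = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix},$$

which has eigenvalues $3$ and $1$ with eigenvectors $(1, 1)$ and $(1, -1)$, so $\|A\|_2 = 3$. For $x = (1, 1)$ we get $Ax = (3, 3)$ and $\|Ax\|_2 = 3\sqrt{2} = \|A\|_2 \|x\|_2$. For $x = (1, -1)$ we get $Ax = (1, -1)$ and $\|Ax\|_2 = \sqrt{2} < 3\sqrt{2}$, so the inequality is strict there.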