Why take the maximum eigenvalue in computing the matrix $2$-norm?


We know that the matrix $2$-norm is defined as

$$\|A\|_2 := \sqrt{\lambda_{\max}(A^T A)}$$

Why do we consider the maximum eigenvalue of $A^T A$?


Best answer:

For a self-adjoint matrix $T$, the Courant–Fischer min-max principle says
$$\lambda_{j+1}=\min_{v_1,\dots,v_j}\ \max_{\substack{\|v\|=1\\ \langle v,v_1\rangle=\dots=\langle v,v_j\rangle=0}}\langle Tv,v\rangle,$$
where the $\lambda_j$ are its eigenvalues in non-increasing order. Now put $T=A^*A$, where $A^*$ is the conjugate transpose (or just the transpose if you are working over the reals). Since $\langle A^*Av,v\rangle=\langle Av,Av\rangle=\|Av\|^2$, this gives, for the squares of the singular values $\sigma_j(A)$ in non-increasing order,
$$\sigma_{j+1}^2(A)=\min_{v_1,\dots,v_j}\ \max_{\substack{\|v\|=1\\ \langle v,v_1\rangle=\dots=\langle v,v_j\rangle=0}}\|Av\|^2.$$
Taking the square root and putting $j=0$ gives you
$$\sigma_1(A)=\max_{\|v\|=1}\|Av\|=\|A\|_{2,2},$$
where $\|\cdot\|_{2,2}$ denotes the operator norm with respect to the usual $2$-norm.
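As a quick numerical sanity check of the $j=0$ case, one can verify with NumPy that $\sqrt{\lambda_{\max}(A^TA)}$ agrees with the operator $2$-norm, and that no unit vector is stretched beyond it (the matrix below is just an arbitrary random example):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 3))  # arbitrary real example matrix

# Square root of the largest eigenvalue of A^T A ...
lam_max = np.max(np.linalg.eigvalsh(A.T @ A))
sigma_1 = np.sqrt(lam_max)

# ... equals the operator 2-norm  max_{||v||=1} ||Av||
op_norm = np.linalg.norm(A, 2)

# Brute force: sample many random unit vectors v;
# the stretch ||Av|| never exceeds sigma_1.
V = rng.standard_normal((3, 10000))
V /= np.linalg.norm(V, axis=0)
stretch = np.linalg.norm(A @ V, axis=0)
```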

Another answer:

The square roots of the eigenvalues of $A^TA$ are commonly known as the singular values $\sigma_i$ of $A$. They describe how one set of vectors (the right singular vectors) is mapped onto a second set of vectors (the left singular vectors). These singular vectors (always normalized and pairwise orthogonal) depend on the matrix, are usually denoted $u_i$ and $v_i$, and satisfy $$ Av_i=\sigma_i u_i. $$

The 2-norm of $A$ (written $\|A\|_2$) is the largest scaling any unit vector experiences under transformation by $A$; the direction of the output is irrelevant (unlike in the definition of eigenvalues, where the direction must be preserved). Since $v_1$ experiences the largest scaling $\sigma_1$ under the transformation, $\sigma_1$ is the 2-norm of $A$.