I'm trying to prove the equivalence of the following two definitions of the spectral norm
- It is the largest singular value of $\pmb A$, or equivalently, it is the square root of the largest eigenvalue of the product $\pmb A^\top\pmb A$.
- It is the maximum, over all nonzero vectors $\pmb{x} \in \mathbb R^n$, of the quotients $\frac{\| \pmb A\pmb{x} \|}{\| \pmb{x} \|}$ where $\| \cdot \|$ denotes the Euclidean norm.
Here's my proof
From (1), we can get that the spectral norm of $\pmb A$ is the square root of the largest solution $\lambda$ to $\pmb A^\top\pmb A\pmb x=\lambda \pmb x$. Multiplying both sides of the equation by $\pmb x^\top$ and take the square, we have
$$ \begin{align} \sqrt{\pmb x^\top\pmb A^\top\pmb A\pmb x}&=\sqrt{\pmb x^\top\lambda \pmb x}\\ &=\sqrt{\lambda \pmb x^\top\pmb x}\\ \Rightarrow\sqrt \lambda&=\sqrt{\pmb x^\top\pmb A^\top\pmb A\pmb x\over \pmb x^\top \pmb x}\\ &=\frac{\| \pmb A\pmb{x} \|}{\| \pmb{x} \|}\tag 1 \end{align} $$ Therefore, we have $\sigma(\pmb A)=\max \sqrt\lambda=\max \frac{\| \pmb A\pmb{x} \|}{\| \pmb{x} \|}$, where $\pmb x\in \mathbb R^n, \pmb x\ne \pmb0$, which gives us $(2)$.
I have a question with the above proof. In the above proof, I implicitly constrain $\pmb x$ to be an eigenvector of the linear transform. However, in $(2)$, it takes the maximum over all nonzero $\pmb x\in\mathbb R^n$. Therefore, I think my proof is insufficient.
How can I prove the equivalence?
By the spectral theorem, $A^T A$ has an orthogonal basis of eigenvectors $v_1, \dots v_n$ with nonnegative real eigenvalues $\lambda_1 \ge \dots \ge \lambda_n$. This means if we write a vector $v = \sum c_i v_i$ in terms of this basis we can write
$$ \| Av \|^2 = \langle Av, Av \rangle = \langle A^T Av, v \rangle = \langle \sum \lambda_i c_i v_i, \sum c_i v_i \rangle = \sum \lambda_i c_i^2;$$
in other words, the basis $v_i$ diagonalizes the quadratic form $\langle Av, Av \rangle$. So the operator norm is the maximum of $\sum \lambda_i c_i^2$ subject to the constraint $\| v \|^2 = \sum c_i^2 = 1$, and now it's very easy to see that the maximum is given by $c_1 = 1, c_i = 0, i \ge 2$ (this is a linear function subject to a linear constraint in $c_i^2$).