2-norm of transpose proof

113 Views Asked by At

I don't understand the proof of ‖x‖2=‖xT2.

I know you can use the Cauchy-Schwartz inequality and that you can prove it by first showing ‖xT2 ⩽ ‖x‖2 and then ‖x‖2 ⩽ ‖xT2. I get proving both inequalities implies that ‖x‖2=‖xT2.

Though I don't get how the Cauchy-Schwartz inequality is used to proves this. I would really appreciate getting a step by step on how you get to it, because I simply don't understand.

2

There are 2 best solutions below

2
On

First of all, we should understand exactly what the definition of $\|x^T\|_2$ is in this context. Whereas $\|x\|_2$ is simply the Euclidean norm of $x$, $\|x^T\|_2$ refers to the "operator norm" or "induced norm" of the matrix $x^T$. That is, $$ \|x^T\|_2 = \max_{\|y\|_2 = 1} \|x^Ty\|_2 = \max_{\|y\|_2 = 1} |x^Ty| $$ (see note below for $\|x^Ty\|_2$ vs $|x^Ty|$). The Cauchy-Schwarz inequality tells us that for any particular $y$ with $\|y\|_2 = 1$, we have $$ |x^Ty|_2 \leq \|x\|_2\,\|y\|_2 = \|x\|_2 \cdot 1 = \|x\|_2. $$ It follows that the maximum $\max_{\|y\|_2 = 1} |x^Ty|$ that we are interested in is at most equal to $\|x\|_2$. That is, $\|x^T\|_2 \leq \|x\|_2$.

As for the other direction of the inequality, I'm not sure how one can get this by using Cauchy-Schwarz again. One approach is simply to note that there is a unit vector $y$ such that $|x^Ty| = \|x\|_2$, meaning that $\max_{\|y\|_2 = 1} |x^Ty| \geq \|x\|_2$. To that end, we can consider $y = x/\|x\|_2$, and find that $$ |x^Ty| = \frac{x^Tx}{\|x\|_2} = \frac{\|x\|_2^2}{\|x\|_2} = \|x\|_2. $$ So, we can indeed conclude that $\|x^T\|_2 = \|x\|_2$.


Note: In the expression $\|x^Ty\|_2$, $x^Ty$ is a vector in $\Bbb R^1$ corresponding to the linear map from $\Bbb R^n$ to $\Bbb R^1$ that is encoded by $x$. In the expression $|x^Ty|$, we are considering the single entry of this entry of this vector as a number and taking the absolute value of this number. Whether these are meaningfully distinct is a matter of philosophy.

0
On

The question is a bit ambiguous, as $\|\cdot\|_2$ may denote the Euclidean norm of a row/column vector or the induced $2$-norm of a $1\times n$ or $n\times 1$ matrix. Anyway, for clarity, let us denote the Euclidean norm by $\|\cdot\|_E$ and the induced $2$-norm by $\|\cdot\|_2$. Now let $x$ be a column vector. We actually have $$ \|x^T\|_E\stackrel{(1)}{=}\|x\|_E\stackrel{(2)}{=}\|x\|_2\stackrel{(3)}{=}\|x^T\|_2. $$ So, the statement in your question is correct, regardless of whether the notation $\|\cdot\|_2$ in your question is interpreted as the Euclidean norm or induced $2$-norm.

Equality $(1)$ is trivial and its proof is omitted here. Equality $(2)$ is almost trivial: $$ \|x\|_2 =\max_{u\in\mathbb R^1,\ \|u\|_E=1}\|xu\|_E =\max_{u=\pm1}\|xu\|_E =\max_{u=\pm1}\|x\|_E =\|x\|_E. $$ Equality $(3)$ is a special case of a more general fact: for any $m\times n$ real matrix $A$, we have $\|A\|_2=\|A^T\|_2$. Here is the proof. First, observe that for any vector $w\in\mathbb R^m$, we have $$ \|w\|_E=\max_{v\in\mathbb R^m,\ \|v\|_E=1}\langle v,w\rangle $$ where the maximum occurs when $v$ is a unit vector point in the same direction as $w$. Now, let $u\in\mathbb R^n$ be the unit vector such that $\|A\|_2=\|Au\|_E$. If we take $w=Au$ in our previous observation, we obtain $$ \begin{aligned} &\|A\|_2 =\|Au\|_E =\max_{v\in\mathbb R^m,\ \|v\|_E=1}\langle v,Au\rangle =\max_{v\in\mathbb R^m,\ \|v\|_E=1}\langle A^Tv,u\rangle\\ &\le\max_{v\in\mathbb R^m,\ \|v\|_E=1}\|A^Tv\|_E\|u\|_E =\max_{v\in\mathbb R^m,\ \|v\|_E=1}\|A^Tv\|_E =\|A^T\|_2. \end{aligned} $$ Hence $\|A\|_2\le\|A^T\|_2$. By interchanging the roles of $A$ and $A^T$ in the argument above, we also obtain $\|A^T\|_2\le\|A\|_2$. Hence $\|A\|_2=\|A^T\|_2$.