Mysterious Proof about Induced Norms (was: Uniqueness of SVD)


To prove non-uniqueness of singular vectors when a repeated singular value is present, the book (Trefethen) argues as follows. Let $\sigma$ be the first singular value of $A$, and $v_{1}$ the corresponding singular vector. Let $w$ be another unit vector, linearly independent of $v_{1}$, such that $||Aw||=\sigma$, and construct a third vector $v_{2}$ in the span of $v_{1}$ and $w$, orthogonal to $v_{1}$. All three are unit vectors, so $w=av_{1}+bv_{2}$ with $|a|^2+|b|^2=1$, and $v_{2}$ is constructed (Gram-Schmidt style) as follows:

$$ {v}_{2}= \dfrac{{w}-({v}_{1}^{T} w ){v}_{1}}{|| {w}-({v}_{1}^{T} {w} ){v}_{1} ||_{2}}$$

Now, Trefethen says, $||A||=\sigma$, so $||Av_{2}||\le \sigma$, but this must be an equality (and so $v_{2}$ is another singular vector for $\sigma$), since otherwise we would have $||Aw||<\sigma$, contradicting the hypothesis.

How so? I cannot see any elementary application of the triangle inequality or the Schwarz inequality that proves this claim.

I am pretty well convinced of the partial non-uniqueness of the SVD in certain situations. Other proofs are feasible, but I wish to understand this specific algebraic step of this specific proof.

Thanks.


There are 4 best solutions below

2
On BEST ANSWER

This is actually very simple. The main point (seemingly missed by the OP) is that $Av_1$ and $Av_2$ must be orthogonal (this is something obvious to people familiar with proofs of SVD, because the induction decomposes into orthogonal spaces).

For every $z\in{\mathbb C}$, you have

$$ ||A(v_2+zv_1)||^2 \leq \sigma^2 ||v_2+zv_1||^2 \tag{1} $$

Expanding, one obtains

$$ ||Av_2||^2+2{\sf Re}\bigg(\bar{z}\big<Av_2,Av_1\big>\bigg)+|z|^2||Av_1||^2 \leq \sigma^2 (||v_2||^2+|z|^2||v_1||^2) \tag{2} $$

which simplifies to

$$ 2{\sf Re}\bigg(\bar{z}\big<Av_2,Av_1\big>\bigg) \leq \sigma^2||v_2||^2-||Av_2||^2 \tag{3} $$

Letting $z=\big<Av_2,Av_1\big>t$ with $t\in{\mathbb R}$, we deduce that $2t\Big|\big<Av_2,Av_1\big>\Big|^2 \leq \sigma^2||v_2||^2-||Av_2||^2$ for every real $t$. The right-hand side does not depend on $t$, so letting $t\to+\infty$ shows that this is possible only if $\big<Av_2,Av_1\big>=0$. We then have, for any $a,b$ with $|a|^2+|b|^2=1$ and $b\neq 0$,

$$ \sigma^2||av_1+bv_2||^2-||A(av_1+bv_2)||^2= |b|^2 (\sigma^2||v_2||^2-||Av_2||^2) \tag{4} $$

Since $w=av_1+bv_2$ with $b\neq 0$ (because $w$ and $v_1$ are linearly independent), $||Aw||=\sigma||w||$ forces $||Av_2||=\sigma||v_2||$.
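A quick numerical sanity check of the steps above may help. This is only a sketch on a made-up real example: the matrix `A` (singular values 2, 2, 1) and the vector `w` are my own choices, not taken from the book, and the helpers `matvec`, `dot`, `norm` are hypothetical names written for this illustration. It builds $v_2$ by Gram-Schmidt, then checks that $\big<Av_2,Av_1\big>=0$ and that both sides of $(4)$ agree.

```python
import math

def matvec(A, x):
    """Matrix-vector product for a list-of-rows matrix."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def dot(x, y):
    """Real inner product (the example is real, so no conjugation is needed)."""
    return sum(xi * yi for xi, yi in zip(x, y))

def norm(x):
    return math.sqrt(dot(x, x))

# Assumed example: A is symmetric with eigenvalues 2, -2, 1,
# so its singular values are 2, 2, 1 and sigma = ||A||_2 = 2 is repeated.
A = [[0.0, 2.0, 0.0],
     [2.0, 0.0, 0.0],
     [0.0, 0.0, 1.0]]
sigma = 2.0

v1 = [1.0, 0.0, 0.0]   # unit vector with ||A v1|| = sigma
w = [0.6, 0.8, 0.0]    # another unit vector with ||A w|| = sigma

# Gram-Schmidt: remove the v1-component of w, then normalise.
a = dot(v1, w)
residual = [wi - a * vi for wi, vi in zip(w, v1)]
b = norm(residual)     # w = a*v1 + b*v2 with a^2 + b^2 = 1
v2 = [ri / b for ri in residual]

Av1, Av2, Aw = matvec(A, v1), matvec(A, v2), matvec(A, w)

cross = dot(Av2, Av1)  # the inner product shown to vanish
lhs4 = sigma**2 * norm(w)**2 - norm(Aw)**2
rhs4 = b**2 * (sigma**2 * norm(v2)**2 - norm(Av2)**2)

print("<Av2, Av1> =", cross)
print("||Av2||    =", norm(Av2))
print("(4): lhs =", lhs4, " rhs =", rhs4)
```

Both sides of $(4)$ come out as zero up to rounding here, which is exactly the situation $||Aw||=\sigma$ and $||Av_2||=\sigma$.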

1
On

This is not an answer but too long for a comment. Suppose that $||Av_2||=\sigma$, then:

$$\sigma^2=||A(w)||^2=\left<A(w),A(w)\right>=\left<aA(v_1)+bA(v_2),aA(v_1)+bA(v_2)\right>=$$ $$=a^2||A(v_1)||^2+b^2||A(v_2)||^2+2ab\left<A(v_1),A(v_2)\right>=\sigma^2+2ab\left<A(v_1),A(v_2)\right>$$

But this is true when $A(v_1)$ and $A(v_2)$ are orthogonal; does this fact help?

4
On

Note: This answer was initially incorrect. Thanks to littleO, who drew attention to my mistake. The essential argument at the end is now based on the answer already given by Ewan Delanoy. So, in fact, the proof from Trefethen remains mysterious to me too :-;


I'm not sure if the arguments of Trefethen regarding the part of the proof you are interested in are sufficient.

In order to be able to consider the text carefully, the following paragraph quotes verbatim the relevant part of the proof of Theorem 4.1 from his Numerical Linear Algebra.

From Numerical Linear Algebra, part of proof of Theorem 4.1 (Trefethen):

First we note that $\sigma_1$ is uniquely determined by the condition that it is equal to $\left\Vert A\right\Vert_2$, as follows from $(4.4)$. Now suppose, that in addition to $v_1$, there is another linearly independent vector $w$ with $\left\Vert w\right\Vert_2=1$ and $\left\Vert Aw\right\Vert_2=\sigma_1$. Define a unit vector $v_2$, orthogonal to $v_1$, as a linear combination of $v_1$ and $w$, $$v_2=\frac{w-\left(v_1^{\ast}w\right)v_1}{\left\Vert w_1-\left(v_1^{\ast}w\right)v_1\right\Vert_2}$$ Since $\left\Vert A\right\Vert_2=\sigma_1$, $\left\Vert Av_2\right\Vert_2\leq\sigma_1$; but this must be an equality, for otherwise, since $w=v_{1}c+v_{2}s$ for some constants $c$ and $s$ with $\left\vert c\right\vert^2+\left\vert s\right\vert^2=1$, we would have $\left\Vert Aw\right\Vert_2<\sigma_1$.

Let's analyse Trefethen's arguments in some detail:

First step: Starting Situation

We know from the beginning of the proof (not stated here) that there is a vector $v_1$ with $\left\Vert v_1\right\Vert_2=1$ and we also set $\sigma_1=\left\Vert A\right\Vert_2$. According to the text above we further assume that this vector is a singular vector with $\left\Vert Av_1\right\Vert_2=\sigma_1$.

Second step: Uniqueness via indirect argument

We now consider (indirect argument) any vector $w$ which is linearly independent of $v_1$ and which, like $v_1$, fulfills $\left\Vert w\right\Vert_2=1$ and $\left\Vert Aw\right\Vert_2=\sigma_1$.

Third step (main idea of the proof): Create a singular vector $v_2$ violating the distinctness assumption of the singular values stated in the theorem

Based on the assumption that $\left\Vert Aw\right\Vert_2=\sigma_1$ we create a second singular vector $v_2$, which also fulfills $\left\Vert Av_2\right\Vert_2=\sigma_1$ and so violates the precondition of distinctness of the singular values.

Since $v_1$ and $w$ are linearly independent, we can create (e.g. with Gram-Schmidt) a vector $v_2$ with $\left\Vert v_2\right\Vert =1$ and which is orthogonal to $v_1$. Since all vectors $u$ with $\left\Vert u\right\Vert =1$ fulfill by definition (of the supremum) $\left\Vert Au\right\Vert_2 \leq \left\Vert A\right\Vert_2$, we get $\left\Vert Av_2\right\Vert_2 \leq \left\Vert A\right\Vert_2=\sigma_1$.

Now with the help of $w$ we can show that $v_2$ even has to fulfill $\left\Vert Av_2\right\Vert_2 = \left\Vert A\right\Vert_2=\sigma_1$, since if otherwise $\left\Vert Av_2\right\Vert_2 <\sigma_1$ we get

\begin{align} \left\Vert Aw\right\Vert_2^{2}&=\left\Vert A(c v_1+s v_2)\right\Vert^{2}_2\\ &=\left\vert c\right\vert^2\left\Vert Av_1\right\Vert_2^2+2\mathsf{Re}\left(c\bar{s}\left<Av_1,Av_2\right>\right)+\left\vert s\right\vert^2\left\Vert Av_2\right\Vert_2^2\\ &=\left\vert c\right\vert^2\sigma_1^2+\left\vert s\right\vert^2\underbrace{{\left\Vert Av_2\right\Vert_2^2}}_{<\sigma_1^2}\qquad(\ast)\\ &<\sigma_1^2\left(\left\vert c\right\vert^2+\left\vert s\right\vert^2\right)\\ &=\sigma_1^2 \end{align}

and this contradicts our assumption that $\left\Vert Aw\right\Vert_2=\sigma_1$. (The strict inequality uses $s\neq 0$, which holds because $w$ and $v_1$ are linearly independent.)

So we get a second singular vector $v_2$ with $\left\Vert Av_2\right\Vert_2=\sigma_1$ and this violates the condition of distinct singular values stated in the theorem.


Please note that line ($\ast$) uses the essential argument from Ewan Delanoy, which states that $Av_1$ and $Av_2$ are orthogonal, so that the inner product vanishes. I don't see a proper argument in Trefethen's proof which I could use instead.
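To see the contrapositive at work, here is a small sketch with a made-up matrix whose largest singular value is simple: `A = diag(2, 1)`, with `v1` and `v2` taken as the standard basis vectors (all of these are my own illustrative choices). For several unit vectors $w=cv_1+sv_2$ with $s\neq 0$ it evaluates both sides of the identity in line ($\ast$) and confirms that $\left\Vert Aw\right\Vert_2<\sigma_1$.

```python
import math

def matvec(A, x):
    """Matrix-vector product for a list-of-rows matrix."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def norm(x):
    return math.sqrt(sum(xi * xi for xi in x))

# Assumed example: singular values 2 and 1, so sigma_1 = 2 is NOT repeated.
A = [[2.0, 0.0],
     [0.0, 1.0]]
sigma1 = 2.0
v1 = [1.0, 0.0]        # ||A v1|| = sigma_1
v2 = [0.0, 1.0]        # orthogonal to v1, with ||A v2|| = 1 < sigma_1
Av2_sq = norm(matvec(A, v2)) ** 2

results = []
for theta in (0.1, 0.5, 1.0, 1.5):
    c, s = math.cos(theta), math.sin(theta)   # c^2 + s^2 = 1, s != 0
    w = [c * x + s * y for x, y in zip(v1, v2)]
    lhs = norm(matvec(A, w)) ** 2             # ||A w||^2
    rhs = c**2 * sigma1**2 + s**2 * Av2_sq    # right-hand side of line (*)
    results.append((theta, lhs, rhs))
    print(f"theta={theta}: ||Aw||^2={lhs:.6f}, (*)={rhs:.6f}, "
          f"strictly below sigma_1^2: {lhs < sigma1**2}")
```

In this example $Av_1$ and $Av_2$ are orthogonal by construction, so the cross term drops out automatically; the check says nothing about why they are orthogonal in general.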

0
On

Trefethen's argument is indeed very strange.

As I understand the above answers, everything would be fine if one knew that $Av_1$ and $Av_2$ are orthogonal.

This can be shown by observing that since $\sigma_1=\Vert A\Vert$ and $v_1$ is a unit vector such that $\Vert Av_1\Vert=\sigma_1$, $v_1$ is an eigenvector of $A^*A$ with eigenvalue $\sigma_1^2$.

Indeed, we have $\sigma_1^2=\Vert Av_1\Vert^2=\langle A^*Av_1,v_1\rangle$, i.e. $\langle (\sigma_1^2Id-A^*A)v_1,v_1\rangle=0$. Since the sesquilinear form $B(u,v)=\langle (\sigma_1^2Id-A^*A)u,v\rangle$ is positive semi-definite (because $\sigma_1^2=\Vert A\Vert^2=\Vert A^*A\Vert$) it follows, by Cauchy-Schwarz's inequality applied to $B$, that $(\sigma_1^2Id-A^*A)v_1=0$, i.e. $A^*Av_1=\sigma_1^2v_1$.

The details for Cauchy-Schwarz are as follows: we have $B(v_1,v_1)=0$, hence by Cauchy-Schwarz: $$\vert B(v_1,u)\vert\leq B(v_1,v_1)^{1/2}B(u,u)^{1/2}=0;$$ that is, $B(v_1,u)=0$ for all vectors $u$. This means that $(\sigma_1^2Id-A^*A)v_1$ is orthogonal to every vector, and hence is equal to $0$.

Once you know that $A^*Av_1=\sigma_1^2v_1$, simply write $$\langle Av_1,Av_2\rangle=\langle A^*Av_1,v_2\rangle=\sigma_1^2\langle v_1,v_2\rangle=0\, .$$

Having said that, it seems much simpler to prove that $\Vert Av_2\Vert=\sigma_1$ as follows. By the above reasoning, $v_1$ and $w$ are both eigenvectors of $A^*A$ with eigenvalue $\sigma_1^2$. Hence so is $v_2$, since $v_2$ is a nonzero linear combination of $v_1$ and $w$. So we have $\Vert Av_2\Vert^2=\langle A^*Av_2,v_2\rangle=\langle\sigma_1^2v_2,v_2\rangle=\sigma_1^2$, as required.
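The eigenvector route can be checked numerically on a small example. The following is only a sketch under assumptions of my own: a real matrix `A` with singular values 2, 2, 1 and a hand-picked `w`, with hypothetical helper names. It verifies that $v_1$, $w$ and the Gram-Schmidt vector $v_2$ all satisfy $A^*Ax=\sigma_1^2x$, and that $\langle Av_1,Av_2\rangle=0$.

```python
import math

def matvec(A, x):
    """Matrix-vector product for a matrix given as a sequence of rows."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def dot(x, y):
    return sum(xi * yi for xi, yi in zip(x, y))

def norm(x):
    return math.sqrt(dot(x, x))

# Assumed example: real A with singular values 2, 2, 1, so sigma_1 = 2.
A = [[0.0, 2.0, 0.0],
     [2.0, 0.0, 0.0],
     [0.0, 0.0, 1.0]]
At = list(zip(*A))     # transpose; A is real, so A^* = A^T
sigma1 = 2.0

v1 = [1.0, 0.0, 0.0]
w = [0.6, 0.8, 0.0]    # unit vector with ||A w|| = sigma_1

# v2 from Gram-Schmidt applied to w against v1.
coef = dot(v1, w)
residual = [wi - coef * vi for wi, vi in zip(w, v1)]
v2 = [ri / norm(residual) for ri in residual]

def eig_residual(x):
    """Largest entry of |A^T A x - sigma_1^2 x|; near zero for an eigenvector."""
    AtAx = matvec(At, matvec(A, x))
    return max(abs(p - sigma1**2 * q) for p, q in zip(AtAx, x))

res_v1, res_w, res_v2 = eig_residual(v1), eig_residual(w), eig_residual(v2)
cross = dot(matvec(A, v1), matvec(A, v2))   # <A v1, A v2>

print("eigen-residuals:", res_v1, res_w, res_v2)
print("<Av1, Av2> =", cross, "  ||Av2|| =", norm(matvec(A, v2)))
```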

Edit: Actually, I didn't read Ewan's answer carefully, since he is also proving that $Av_1$ and $Av_2$ are orthogonal! His argument is in fact the one that is used in the proof of the Cauchy-Schwarz inequality.