Calculating SVD by hand: resolving sign ambiguities in the range vectors.

13.2k Views Asked by At

When calculating the SVD of the matrix

$$A = \begin{bmatrix}3&1&1\\-1&3&1\end{bmatrix}$$

I followed these steps

$$A A^{T} = \begin{bmatrix}3&1&1\\-1&3&1\end{bmatrix} \begin{bmatrix}3&-1\\1&3\\1&1\end{bmatrix} = \begin{bmatrix}11&1\\1&11\end{bmatrix}$$

$$\det(A A^{T} - \lambda I) = (11-\lambda)^{2} - 1 = 0$$

Hence, the eigenvalues are $\lambda_{1} = 12$ and $\lambda_{2} = 10$.

When $\lambda_{1} = 12$:

$$ \begin{bmatrix}11-\lambda_{1}&1\\1&11-\lambda_{1}\end{bmatrix} \begin{bmatrix}x_{1}\\x_{2}\end{bmatrix} = \begin{bmatrix}0\\0\end{bmatrix}$$

$$x_{1} = x_{2} \implies u_{1} = \begin{bmatrix}t\\t\end{bmatrix}$$

And for $\lambda_{2} = 10$:

$$x_{1} = -x_{2} \implies u_{2} = \begin {bmatrix}t\\-t\end{bmatrix}$$

Now

$$U = \begin {bmatrix} u_{1}&u_{2} \end{bmatrix}$$

$u_{1}$ and $u_{2}$ are orthonormal. So the for $u_{1} = \begin{bmatrix}t\\t\end{bmatrix}$ , $u_{2} = \begin{bmatrix}t\\-t\end{bmatrix}$ I know $\left| t \right| = \frac{1}{\sqrt{2}}$ and $u_{1}.u_{2}=0$.

My question how can we decide about the sign?

For example I think both $U= \begin{bmatrix}\frac{1}{\sqrt{2}} &\frac{1}{\sqrt{2}} \\\frac{1}{\sqrt{2}} &\frac{-1}{\sqrt{2}} \end{bmatrix}$ and $U=\begin{bmatrix}\frac{-1}{\sqrt{2}} &\frac{1}{\sqrt{2}} \\\frac{1}{\sqrt{2}} &\frac{1}{\sqrt{2}} \end{bmatrix}$ could be answers. Then Which one should I choose?

======

Update1:

Based on answers posted I rewrite: $u_{1} = sgn (t_1) \begin{bmatrix} \frac{1}{\sqrt{2}}\\ \frac{1}{\sqrt{2}}\end{bmatrix}$

$u_{2} = sgn (t_2) \begin{bmatrix} \frac{1}{\sqrt{2}}\\ \frac{-1}{\sqrt{2}}\end{bmatrix}$

$$U= \begin{bmatrix} \frac{1}{\sqrt{2}}& \frac{-1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}}& \frac{1}{\sqrt{2}}\end{bmatrix} \begin{bmatrix} \operatorname{sgn} (t_1)&0 \\ 0& \operatorname{sgn} (t_2) \end{bmatrix}$$

======

Update2:

I continued by calculation of $V$ as follow:

$ A^{T} A = \begin{bmatrix}3&-1\\1&\\1&1\end{bmatrix} \begin{bmatrix}3&1&1\\-1&3&1\end{bmatrix} = \begin{bmatrix}10&0&2\\0&10&4\\2&4&2\end{bmatrix}$

$det( A^{T} A- \lambda I)=0$

$\lambda_{1} = 12 , v_1 = sgn(t_3) \begin{bmatrix}t_{3}\\ 2t_{3} \\ t_{3} \end{bmatrix}$

$\lambda_{2} = 10 , V_{2} = sgn(t_4) \begin{bmatrix}t_{4}\\ -0.5t_{4} \\ 0 \end{bmatrix}$

$\lambda_{3} = 0 , V_{3} = sgn(t_5) \begin{bmatrix}t_{5}\\ 2t_{5} \\ -5t_{5} \end{bmatrix}$

$V= \begin{bmatrix}\frac{1}{\sqrt{6}} &\frac{2}{\sqrt{5}} &\frac{1}{\sqrt{30}}\\ \frac{2}{\sqrt{6}}&\frac{-1}{\sqrt{5}}&\frac{2}{\sqrt{30}}\\ \frac{1}{\sqrt{6}}& 0& \frac{-5}{\sqrt{30}}\end{bmatrix}\begin{bmatrix} \operatorname{sgn} (t_3)&0&0 \\ 0& \operatorname{sgn} (t_4)&0\\ 0&0& \operatorname{sgn} (t_5) \end{bmatrix}$

I try to check if all possible answers for U and V are valid :

$A = U\Sigma V^{*}$ $A = \begin{bmatrix} \frac{1}{\sqrt{2}}& \frac{-1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}}& \frac{1}{\sqrt{2}}\end{bmatrix} \begin{bmatrix} \operatorname{sgn} (t_1)&0 \\ 0& \operatorname{sgn} (t_2) \end{bmatrix} \Sigma (\begin{bmatrix}\frac{1}{\sqrt{6}} &\frac{2}{\sqrt{5}} &\frac{1}{\sqrt{30}}\\ \frac{2}{\sqrt{6}}&\frac{-1}{\sqrt{5}}&\frac{2}{\sqrt{30}}\\ \frac{1}{\sqrt{6}}& 0& \frac{-5}{\sqrt{30}}\end{bmatrix} \begin{bmatrix} \operatorname{sgn} (t_3)&0&0 \\ 0& \operatorname{sgn} (t_4)&0\\ 0&0& \operatorname{sgn} (t_5) \end{bmatrix} )^{*} $ $A = \begin{bmatrix} \frac{1}{\sqrt{2}}& \frac{-1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}}& \frac{1}{\sqrt{2}}\end{bmatrix} \begin{bmatrix} \operatorname{sgn} (t_1)&0 \\ 0& \operatorname{sgn} (t_2) \end{bmatrix} \Sigma \begin{bmatrix} \operatorname{sgn} (t_3)&0&0 \\ 0& \operatorname{sgn} (t_4)&0\\ 0&0& \operatorname{sgn} (t_5) \end{bmatrix} \begin{bmatrix}\frac{1}{\sqrt{6}} &\frac{2}{\sqrt{5}} &\frac{1}{\sqrt{30}}\\ \frac{2}{\sqrt{6}}&\frac{-1}{\sqrt{5}}&\frac{2}{\sqrt{30}}\\ \frac{1}{\sqrt{6}}& 0& \frac{-5}{\sqrt{30}}\end{bmatrix}^{*} $

When I assigned $U= \begin{bmatrix}\frac{-1}{\sqrt{2}} &\frac{-1}{\sqrt{2}} \\\frac{-1}{\sqrt{2}} &\frac{1}{\sqrt{2}} \end{bmatrix}$ and $V= \begin{bmatrix}\frac{-1}{\sqrt{6}} &\frac{-2}{\sqrt{5}} &\frac{-1}{\sqrt{30}}\\ \frac{-2}{\sqrt{6}}&\frac{1}{\sqrt{5}} & \frac{-2}{\sqrt{30}} \\ \frac{-1}{\sqrt{6}}& 0& \frac{5}{\sqrt{30}}\end{bmatrix}$ and $\Sigma = \begin{bmatrix}\sqrt{20}&0&0\\ 0&\sqrt{10}&0\end{bmatrix} $in $A = U\Sigma V^{*}$ I got the A.

But when I updated U as $U = \begin{bmatrix}\frac{1}{\sqrt{2}} &\frac{1}{\sqrt{2}} \\\frac{1}{\sqrt{2}} &\frac{-1}{\sqrt{2}} \end{bmatrix}$, it produced -A. This probably means certain version of $U$ and $V$ will reproduce A. I haven't figured how should I choose them.

2

There are 2 best solutions below

2
On BEST ANSWER

The left singular vectors are

$$\mathrm{u}_1 \in \left\{ t_1 \begin{bmatrix} 1\\ 1\end{bmatrix} : t_1 \in \mathbb R \right\}$$

$$\mathrm{u}_2 \in \left\{ t_2 \begin{bmatrix} 1\\ -1\end{bmatrix} : t_2 \in \mathbb R \right\}$$

We want the left singular vectors to be orthonormal. They are already orthogonal. Normalizing,

$$\mathrm{u}_1 = \frac{t_1}{\sqrt{2 t_1^2}} \begin{bmatrix} 1\\ 1\end{bmatrix} = \operatorname{sgn} (t_1) \begin{bmatrix} \frac{1}{\sqrt{2}}\\ \frac{1}{\sqrt{2}}\end{bmatrix}$$

$$\mathrm{u}_2 = \frac{t_2}{\sqrt{2 t_2^2}} \begin{bmatrix} 1\\ -1\end{bmatrix} = \operatorname{sgn} (t_2) \begin{bmatrix} \frac{1}{\sqrt{2}}\\ -\frac{1}{\sqrt{2}}\end{bmatrix}$$

where $\operatorname{sgn}$ denotes the signum function. Hence,

$$\mathrm U = \begin{bmatrix} | & |\\ \mathrm{u}_{1} & \mathrm{u}_{2}\\ | & |\end{bmatrix} = \begin{bmatrix} \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}}\\ \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}}\end{bmatrix} \begin{bmatrix} \operatorname{sgn} (t_1) & 0\\ 0 & \operatorname{sgn} (t_2)\end{bmatrix}$$

There are $2^2 = 4$ possible choices.

0
On

This is a very nice post with an illuminating discussionon the confusion between sign choices and how they propagate.

Define the singular value decomposition of a rank $\rho$ matrix as $$ \mathbf{A} = \mathbf{U} \, \Sigma \, \mathbf{V}^{*}. $$

Connecting vectors between row and column spaces

The sign conventions can be understood by looking at the relationship between the $u$ and $v$ vectors $$ u_{k} = \sigma^{-1}_{k} \mathbf{A} v_{k}, \qquad v_{k} = \sigma^{-1}_{k} u^{*}_{k} \mathbf{A}, \qquad k = 1, \rho \tag{1} $$ Changing the sign of a $u$ vector induces a sign change in the corresponding $v$ vector; changing the sign of $v$ vector induces a sign change the corresponding $u$ vector.

Example matrix

We certainly agree the singular values are $\sigma = \left\{ 2\sqrt{3}, \sqrt{10} \right\}$. Let's nominate a set of eigenvectors.

$$ \mathbf{A}\,\mathbf{A}^{*}: \quad % u_{1} = % \frac{1}{\sqrt{2}} \left[ \begin{array}{r} 1 \\ 1 \\ \end{array} \right], \quad % u_{2} = % \frac{1}{\sqrt{2}} \left[ \begin{array}{r} -1 \\ 1 \\ \end{array} \right] $$ $$ \mathbf{A}^{*} \mathbf{A}: \quad % v_{1} = % \frac{1}{\sqrt{6}} \left[ \begin{array}{r} 1 \\ 2 \\ 1 \end{array} \right], \quad % v_{2} = % \frac{1}{\sqrt{5}} \left[ \begin{array}{r} -2 \\ 1 \\ 0 \end{array} \right] % v_{3} = % \frac{1}{\sqrt{30}} \left[ \begin{array}{r} -1 \\ -2 \\ 5 \end{array} \right] $$ How did we decide to distribute the minus signs in these vectors? Use the convention of making the first entries negative.

Stepping back, the eigenvectors have an inherent global sign ambiguity. For example $$ \begin{align} % \mathbf{A}^{*} \mathbf{A} v_{1} &= \mathbf{0} \\ % \mathbf{A}^{*} \mathbf{A} \left(-v_{1}\right) &= \left(- \mathbf{0} \right) = \mathbf{0} % \end{align} $$ Using this arbitrary set of column vectors, construct the default SVD: $$ \mathbf{A} = \left[ \begin{array}{cc} u_{1} & u_{2} \end{array} \right] % \Sigma \, % \left[ \begin{array}{ccc} v_{1} & v_{2} & v_{3} \end{array} \right]^{*} $$ If we select the sign convention $\color{blue}{\pm}$ on the $k-$th column vector, we induce the sign convention $\color{red}{\pm}$ on the $k-$th column vector in the complimentary range space. The $\color{blue}{choice}$ induces the $\color{red}{consequence}$.

Choices and consequences

For example, find the SVD by resolving the column space to produce the vectors $u$. The vectors $v$ will be constructed using the second equality in $(1)$. Flipping the global sign on $u_{1}$ flips the global sign on $v_{1}$. $$ \mathbf{A} = \left[ \begin{array}{cc} \color{blue}{\pm}u_{1} & u_{2} \end{array} \right] % \Sigma \, % \left[ \begin{array}{ccc} \color{red}{\pm}v_{1} & v_{2} & v_{3} \end{array} \right]^{*} $$ A nice feature of @Crimson's post is the demonstration that where can choose which range space to compute and which to construct. Compute $\mathbf{V}$ and construct $\mathbf{U}$, or compute $\mathbf{U}$ and construct $\mathbf{V}$.

The other path is to resolve the row space vectors $v$ and use the first equality in $(1)$ to construct the vectors $u$. The point is that flipping the global sign on $v_{1}$ flips the global sign on $u_{1}$ $$ \mathbf{A} = \left[ \begin{array}{cc} \color{red}{\pm}u_{1} & u_{2} \end{array} \right] % \Sigma \, % \left[ \begin{array}{ccc} \color{blue}{\pm}v_{1} & v_{2} & v_{3} \end{array} \right]^{*} $$ Choice and consequence swap places. The process for the first vector extends to all $\rho$ vectors in the range space.

Counting choices

The number of unique sign choices is $2^\rho$, two choices for each range space vector.

Here the matrix rank is $\rho = 2$ and there are $2^{2} = 4$ unique vector sets. If $\mathbf{A}\mathbf{A}^{*}$ is resolved, then: $$ \begin{array}{cc|cc} u_{1} & u_{2} & v_{1} & v_{2} \\\hline \color{blue}{+} & \color{blue}{+} & \color{red}{+} & \color{red}{+} \\ \color{blue}{+} & \color{blue}{-} & \color{red}{+} & \color{red}{-} \\ \color{blue}{-} & \color{blue}{+} & \color{red}{+} & \color{red}{+} \\ \color{blue}{-} & \color{blue}{-} & \color{red}{-} & \color{red}{-} \\ \end{array} $$ If instead $\mathbf{A}^{*}\mathbf{A}$ is resolved, then: $$ \begin{array}{cc|cc} v_{1} & v_{2} & u_{1} & u_{2} \\\hline \color{blue}{+} & \color{blue}{+} & \color{red}{+} & \color{red}{+} \\ \color{blue}{+} & \color{blue}{-} & \color{red}{+} & \color{red}{-} \\ \color{blue}{-} & \color{blue}{+} & \color{red}{+} & \color{red}{+} \\ \color{blue}{-} & \color{blue}{-} & \color{red}{-} & \color{red}{-} \\ \end{array} $$

Another example is in SVD and the columns — I did this wrong but it seems that it still works, why?