SVD of matrix vs. SVD of its transpose for m < n

1.3k Views Asked by At

Let $\mathbf{A}(m \times n)$ and $\mathbf{B}(m \times n)$ be two real-valued matrices. Minimizing $\| \mathbf{AQ} - \mathbf{B} \|^2_F$ with $\mathbf{Q}$ being a rotation matrix is achieved by

$$\mathbf{Q} = \mathbf{U} \mathbf{V}^T$$

with the SVD of $\mathbf{A}^T\mathbf{B}$ being

$$\mathbf{A}^T\mathbf{B} = \mathbf{UDV}^T \qquad \text{(case 1)}.$$

Generally, I would assume that we get the same results by using the transpose of $\mathbf{A}^T\mathbf{B}$, i.e.

$$\mathbf{B}^T\mathbf{A} = \mathbf{UDV}^T \qquad \text{(case 2)}$$

and calculating

$$\mathbf{Q} = \mathbf{V} \mathbf{U}^T$$

However, the two solutions are often different for $m < n$. An example (calculated in R)

$$ \mathbf{A}= \begin{pmatrix} 0.0 & -0.5 & -1.0 \\ 0.0 & 0.5 & 1.0 \\ \end{pmatrix} \mkern36mu \mathbf{B}= \begin{pmatrix} 0.5 & 0.5 & 0.0 \\ -0.5 & -0.5 & 0.0 \\ \end{pmatrix} $$

When calculating $\mathbf{Q}$ by factoring according to case 1 (subscript 1) we get

$$ \mathbf{Q_1}= \begin{pmatrix} -0.32 & 0.32 & 0.89 \\ -0.88 & 0.25 & -0.40 \\ -0.35 & -0.92 & 0.20 \\ \end{pmatrix} $$

For case 2 (subscript 2) we get

$$ \mathbf{Q_2}= \begin{pmatrix} 0.00 & 0.00 & 1.00 \\ -0.95 & 0.32 & 0.00 \\ -0.32 & -0.95 & 0.00 \\ \end{pmatrix} $$

Both solutions give the same result for the initial problem, as $\mathbf{AQ_1} = \mathbf{AQ_2}$ Yet, the rotations matrices are different.

$$ \mathbf{AQ_1}= \begin{pmatrix} 0.79 & 0.79 & -0.00 \\ -0.79 & -0.79 & 0.00 \\ \end{pmatrix} \mkern72mu \mathbf{AQ_2}= \begin{pmatrix} 0.79 & 0.79 & 0.00 \\ -0.79 & -0.79 & 0.00 \\ \end{pmatrix} $$

I do not really understand why this is the case and how the results relate to one another. I guess that $m < n$ is the key to it? Can someone explain this in a way, so it can be understood by a non-mathematician like me? ;)

1

There are 1 best solutions below

1
On BEST ANSWER

The problem does not need to have a unique solution if $A^TB$ is rank-deficient. To see this, note that $$ \|AQ-B\|_F^2=\|A\|_F^2+\|B\|_F^2-2\,\mathrm{tr}(B^TAQ), $$ so minimizing $\|AQ-B\|_F$ over orthogonal $Q$ is equivalent to maximizing $\mathrm{tr}(B^TAQ)$ over the same set. Take an SVD $USV^T=B^TA$ and set $Q=VWU^T$ for some orthogonal $W$ and note that $$\tag{1} \mathrm{tr}(B^TAQ)=\mathrm{tr}(SW)=\sum_{i=1}^{\mathrm{rank}(A^TB)}\sigma_i w_{ii}. $$ A matrix $W$ giving the maximum trace is obtained by setting the diagonal entries of $W$ corresponding to nonzero singular values in $S$ equal to one, so typically, a solution is found simply by setting $W=I$ giving the second solution formula in your question. However, note, that zero singular values have no contributions whatsoever to the trace so if $$ S=\pmatrix{S_+&0\\0&0} $$ with $S_+$ containing the nonzero singular values of $B^TA$, $$ W=\pmatrix{I&0\\0&\tilde{W}}, $$ where $\tilde{W}$ is an orthogonal matrix, is also a solution to the original minimization problem.