Given matrices $A,B$, minimize $\|UAV^T - B\|_F$ over orthogonal matrices $U, V$

837 Views Asked by At

(Question edited to shorten and clarify it, see the history for the original)

Suppose we are given two $n\times n$ matrices $A$ and $B$. I am interested in finding the closest matrix to $B$ that can be achieved by multiplying $A$ with orthogonal matrices. To be precise, the problem is

$$\begin{align} \min_{U,V}\ & \|UAV^T-B\|_F \\ \text{s.t.}\ & U^TU = I \\ & V^TV = I, \end{align}$$ where $\|\cdot\|_F$ is the Frobenius norm.

Without loss of generality*, we can restrict our attention to diagonal matrices with nonnegative diagonal entries $a_1,a_2,\ldots,a_n$ and $b_1,b_2,\ldots,b_n$. My hypothesis is that in this case the optimal $UAV^T$ is still diagonal, with its entries being the permutation of $a_i$ which minimizes $\sum_i (a_{\pi_i} - b_i)^2$. In other words, $U=V=P$, where $P$ is the permutation matrix corresponding to said permutation $\pi$. This appears to be true based on numerical tests, but I don't know how to prove it. Is there an elegant proof?


*For arbitrary $A$ and $B$, take their singular value decompositions $A=U_A\Sigma_AV_A^T$ and $B=U_B\Sigma_BV_B^T$ to obtain $$\begin{align} \|UAV^T-B\|_F &= \|UU_A\Sigma_AV_AV^T-U_B\Sigma_BV_B^T\|_F \\ &= \|U'\Sigma_AV'^T-\Sigma_B\|_F, \end{align}$$ where $U'=U_B^{-1}UU_A$ and $V'=V_B^{-1}VV_A$ are orthogonal. So we can work with $\Sigma_A$ and $\Sigma_B$ instead.

3

There are 3 best solutions below

0
On BEST ANSWER

We may assume that $A,B$ are non-negative diagonal. We calculate the extrema of the function $f:(U,V)\in O(n)^2\rightarrow tr((UAV^T-B)(VAU^T-B))$.

Note that $H_1$ is in the tangent space to $O(n)$ in $U$ iff $U^TH_1\in SK$ (it is skew), that is $H_1=UH$ where $H\in SK$.

Then $\dfrac{\partial f}{\partial U}(U,V):H\in SK\rightarrow 2tr(AV^T(VAU^T-B)UH)=0$ for all $H$. Then $AV^T(VAU^T-B)U=A(A-V^TBU)$ is symmetric.

In the same way, $\dfrac{\partial f}{\partial V}(U,V):K\in SK\rightarrow 2tr(AU^T(UAV^T-B)VK)=0$ for all $K$. Then $AU^T(UAV^T-B)V=A(A-U^TBV)$ is symmetric, that is $(A-V^TBU)A$ is symmetric.

(HYP). Assume that the singular values of $A$ are distinct and that the singular values of $B$ too.

Let $Z=A-V^TBU$. One has $AZ=Z^TA,ZA=AZ^T$. Then $Z$ is diagonal, that is $D=V^TBU$ is diagonal. Then $D^2=V^TB^2V=U^TB^2U$; it is easy to see that $U=V$ and are quasi-permutations (the entries are $\pm 1$). Thus the minimum can be obtained for $U=V$, a permutation; moreover, this choice is unique.

Now, if we don't assume(HYP), then there is one infinity of couples $(U,V)$ that reach the minimum. Yet, by a continuity reasoning, we can prove that there is a permutation $U=V$ that reaches the minimum. Indeed, the required minimum is a continuous function of $A,B$.

5
On

you want to minimize over $U$ an unitary matrix :

$$\|U A-B\|_F^2 = \sum_{n=1}^N \|U A_n - B_n\|^2$$

where $A_n,B_n$ are the rows of $A,B$. since any permutation matrix is unitary, you can also consider minimizing :

$$\sum_{n=1}^N \|U A_{\phi_n} - B_n\|^2$$ over $U$ and a permutation $\phi$. if $v,w$ are collinear vectors, then $$\min_U \|U v - w \|^2 = | \ \|v\|^2-\|w\|^2 \ |$$ and this stays true even when $u,v$ are not collinear. hence :

$$\|U A-B\|_F^2 \ge \sum_{n=1}^N | \ \|A_{\phi_n}\|^2-\|B_n\|^2 \ |$$

finally, when $A,B$ are diagonal matrices, the optimum is attained with :

$$\min_\phi \sum_{n=1}^N | \, A_{\phi_n,\phi_n}^2- B_{n,n}^2 |$$

which can be written with a signed permutation matrix $P$ representing the optimal permutation $\phi$ and the signs simplifying the absolute values :

$$\min_P \sum_{n=1}^N \| \, PA_{\phi_n}- B_{n}\|^2 $$

and by taking the SVD, we have the general solution even when $A,B$ are not diagonal.

(and I leave you seing how this answers to your question on $U = V$)

0
On

Independently of @loup blanc's answer, I found a more elementary partial solution showing that the $2\times2$ case implies the general $n\times n$ case.

The desired proposition is equivalent to the following: For any $n\times n$ matrix $A$ and diagonal $n\times n$ matrix $B$, a matrix $\tilde A=UAV^T$ which minimizes $\|\tilde A-B\|$ is diagonal.

Suppose the hypothesis is true for $2\times2$ matrices. In the $n\times n$ case, suppose $A$ is not diagonal, that is, it has some nonzero off-diagonal entry. We will show that $A$ cannot be optimal.

Without loss of generality, take the nonzero off-diagonal entry to lie in the upper left $2\times 2$ block. Divide $A$ and $B$ into blocks $$A = \begin{bmatrix}A_{11} & A_{12} \\ A_{21} & A_{22}\end{bmatrix}, \qquad B = \begin{bmatrix}B_{11} & 0 \\ 0 & B_{22}\end{bmatrix},$$ where the diagonal blocks are of size $2\times2$ and $(n-2)\times(n-2)$ respectively.

By assumption, the $2\times2$ problem with non-diagonal $A_{11}$ and diagonal $B_{11}$ has a solution $\tilde A_{11}=U_{11}A_{11}V_{11}^T$ such that $\tilde A_{11}$ is diagonal and $\|\tilde A_{11}-B_{11}\|^2<\|A_{11}-B_{11}\|^2$.

Consider orthogonal matrices $$U = \begin{bmatrix}U_{11} & 0 \\ 0 & I\end{bmatrix}, \qquad V = \begin{bmatrix}V_{11} & 0 \\ 0 & I\end{bmatrix}.$$ Then $$\begin{align} \|UAV^T-B\|^2 &= \left\|\begin{bmatrix}U_{11}A_{11}V_{11}^T & U_{11}A_{12} \\ A_{21}V_{11}^T & A_{22}\end{bmatrix} - \begin{bmatrix}B_{11} & 0 \\ 0 & B_{22}\end{bmatrix}\right\|^2 \\ &= \|U_{11}A_{11}V_{11}^T - B_{11}\|^2 + \|U_{11}A_{12}\|^2 + \|A_{21}V_{11}^T\|^2 + \|A_{22}-B_{22}\|^2 \\ &= \|\tilde A_{11} - B_{11}\|^2 + \|A_{12}\|^2 + \|A_{21}\|^2 + \|A_{22}-B_{22}\|^2 \\ &< \|A_{11} - B_{11}\|^2 + \|A_{12}\|^2 + \|A_{21}\|^2 + \|A_{22}-B_{22}\|^2 \\ &= \|A - B\|^2. \end{align}$$ Thus, $UAV^T$ has a better objective value than $A$, so the latter cannot be optimal.