Find the farthest rotation matrix in $\mathrm{SO}(3)$ from a given matrix.


Consider the norm $\| A \| = \sqrt{\mathrm{tr}(AA^t)}$. It's easy to see that $\mathrm{SO}(3)$ is a compact subspace of the $3 \times 3$ matrices in the topology induced by this norm: $\mathrm{O}(3)$ is compact, and $\mathrm{SO}(3)$, being the inverse image of $\{1\}$ under the map $\det$, is a closed subset of $\mathrm{O}(3)$. So it makes sense to talk about the nearest and the farthest rotation matrix from a given matrix. The former, the nearest one, has been discussed online, and I could find a lot of information about it by Googling. However, the farthest rotation matrix was not discussed. Out of curiosity: is it possible to find the farthest rotation matrix from a given matrix?

I tried to solve the problem using Lagrange multipliers but I didn't know how to proceed because I'm not good at matrix calculus.


Accepted answer:

Finding the farthest matrix is actually not that different from finding the nearest matrix. The same techniques are used. It's just the conclusion that is different.

In general, suppose $A\in M_n(\mathbb R)$ and we want to maximise or minimise the Frobenius norm $\|A-R\|_F$ subject to $R\in SO(n,\mathbb R)$. Let $A=USV^T$ be a singular value decomposition and let $Q=U^TRV$. The value of the objective function is then equal to $\|S-Q\|_F$. By considering the squared Frobenius norm, we see the optimisation of $\|A-R\|_F$ is equivalent to the optimisation of $\operatorname{tr}(SQ)$.
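The reduction from $\|A-R\|_F$ to $\|S-Q\|_F$ rests on the orthogonal invariance of the Frobenius norm. A quick numeric sanity check of that step (my own sketch, not part of the answer) with numpy:

```python
# Sanity check: for random A and random R in SO(n), the Frobenius norm
# ||A - R||_F should equal ||S - Q||_F with Q = U^T R V, because the
# Frobenius norm is invariant under left/right orthogonal multiplication.
import numpy as np

rng = np.random.default_rng(0)
n = 3
A = rng.standard_normal((n, n))

# SVD A = U S V^T (numpy returns V^T directly as Vt).
U, s, Vt = np.linalg.svd(A)
S = np.diag(s)

# A random special orthogonal R via QR of a Gaussian matrix,
# flipping one column if needed so that det R = +1.
R, _ = np.linalg.qr(rng.standard_normal((n, n)))
if np.linalg.det(R) < 0:
    R[:, 0] = -R[:, 0]

Q = U.T @ R @ Vt.T           # Q = U^T R V
lhs = np.linalg.norm(A - R, "fro")
rhs = np.linalg.norm(S - Q, "fro")
print(abs(lhs - rhs))        # tiny (floating-point roundoff)
```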

Suppose $Q$ is a global optimiser of $\operatorname{tr}(SQ)$. The usual calculus argument shows that $SQ$ must be symmetric, i.e. $SQ=(SQ)^T=Q^TS$. Hence $S^2=(SQ)(Q^TS)=(Q^TS)(SQ)=(Q^TSQ)^2$ and (by the uniqueness of positive semidefinite square root) $S=Q^TSQ$. Thus $S$ commutes with $Q$ and the eigenspace corresponding to each eigenvalue of $S$ is an invariant subspace of $Q$.

For each nonzero eigenvalue of $S$, since the restriction of $S$ on the corresponding eigenspace is just a scaling operator, the condition that $SQ$ is symmetric means that the restriction of $Q$ on that eigenspace is symmetric too. If $S$ has a zero eigenvalue, since the restriction of $Q$ on the null space of $S$ does not affect the value of $\operatorname{tr}(SQ)$, we can also assume that the restriction of $Q$ on that null space is symmetric.

In other words, there exists a global optimiser of $\operatorname{tr}(SQ)$ such that $Q$ is symmetric. Therefore, by simultaneous orthogonal diagonalisation, we may assume that $Q$ is diagonal. As $Q$ is also real orthogonal, its diagonal entries must be $\pm1$.

The argument up to this point is the same whether we want to maximise or minimise $\|A-R\|_F$. With the observation that the optimal $Q$ can be taken to be a diagonal orthogonal matrix, it is now clear that the global maximum of $\|A-R\|_F$ subject to $R\in SO(n,\mathbb R)$ is attained at \begin{aligned} R&=-U\operatorname{diag}\left(1,\ldots,1,\det(-UV^T)\right)V^T,\\ \|A-R\|_F&=\sqrt{\sum_{i=1}^{n-1}(s_i+1)^2+\left(s_n+\det(-UV^T)\right)^2}, \end{aligned} where $s_1\ge s_2\ge\cdots\ge s_n\ge0$ are the singular values of $A$. In contrast, the global minimum of $\|A-R\|_F$ subject to $R\in SO(n,\mathbb R)$ is attained at \begin{aligned} R&=U\operatorname{diag}\left(1,\ldots,1,\det(UV^T)\right)V^T,\\ \|A-R\|_F&=\sqrt{\sum_{i=1}^{n-1}(s_i-1)^2+\left(s_n-\det(UV^T)\right)^2}. \end{aligned}
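The two closed forms above are easy to implement. The following sketch (the helper names are my own) computes both matrices and checks them against a crowd of random rotations:

```python
# Sketch of the closed-form nearest/farthest rotations from this answer.
import numpy as np

def nearest_farthest_rotation(A):
    """Return (R_near, R_far) in SO(n) minimising / maximising ||A - R||_F."""
    U, s, Vt = np.linalg.svd(A)
    n = A.shape[0]
    # Nearest: R = U diag(1, ..., 1, det(UV^T)) V^T.
    d_near = np.ones(n)
    d_near[-1] = np.linalg.det(U @ Vt)
    R_near = U @ np.diag(d_near) @ Vt
    # Farthest: R = -U diag(1, ..., 1, det(-UV^T)) V^T.
    d_far = np.ones(n)
    d_far[-1] = np.linalg.det(-U @ Vt)
    R_far = -U @ np.diag(d_far) @ Vt
    return R_near, R_far

def rand_rot(n, rng):
    """A random rotation: QR of a Gaussian matrix, det corrected to +1."""
    Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
    if np.linalg.det(Q) < 0:
        Q[:, [0, 1]] = Q[:, [1, 0]]   # swapping two columns flips the sign
    return Q

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 3))
R_near, R_far = nearest_farthest_rotation(A)

d = lambda R: np.linalg.norm(A - R, "fro")
samples = [d(rand_rot(3, rng)) for _ in range(2000)]
print(d(R_near) <= min(samples), d(R_far) >= max(samples))  # True True
```

Both returned matrices have determinant $+1$, and no sampled rotation is nearer than `R_near` or farther than `R_far`.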

Answer (score 4):

Since @Travis gave an elegant solution using geometry, we will instead solve the problem using Lagrange multipliers. WLOG (by the orthogonal invariance of the norm, assuming the given matrix is itself a rotation), let us consider the function $f:M_{n\times n}(\mathbb{R})\rightarrow \mathbb{R}$ defined by \begin{align} f(A) = \|I-A\|^2 \end{align} subject to the system of constraints \begin{align} g(A)=A^TA -I = 0. \end{align} Note that $f$ is a function of $n^2$ variables and there are $\frac{n(n+1)}{2}$ constraint equations (i.e. $6$ when $n=3$).

More explicitly, since the $(i,j)$ entry of $A^TA$ is $\sum_k a_{ki}a_{kj}$, we have the constraint equations \begin{align} g_{ij}(A)=\sum_{k=1}^n a_{ki}a_{kj} - \delta_{ij}=0 \ \ \ \text{ for }\ \ 1\leq i\leq j \leq n. \end{align}

Now, we can write down the Lagrange function \begin{align} \mathcal{L}(A,\lambda) = \|I-A\|^2-\sum_{i\leq j}\lambda_{ij} g_{ij}(A), \end{align} which we can rewrite in the form \begin{align} \mathcal{L}(A,\lambda) = \|I-A\|^2-\operatorname{tr}(\lambda^T g(A)) \end{align} where $\lambda$ is a symmetric $n\times n$ matrix. Setting the gradient to zero gives \begin{align} \nabla_{A}\mathcal{L} &= 2(A-I)-A(\lambda+\lambda^T) = 2(A-I)-2A\lambda=\mathbf{0},\\ \nabla_{\lambda}\mathcal{L} &= A^TA-I=\mathbf{0}. \end{align}

Solving the first equation for $\lambda$ (using $A^{-1}=A^T$) yields \begin{align} \lambda = A^{-1}(A-I) = I-A^T. \end{align} Since $\lambda$ is symmetric, $I-A^T=I-A$, i.e. $A=A^T$. Together with $A^TA=I$ this gives $A^2=I$, so every eigenvalue of $A$ equals $\pm 1$. Since $\det A = 1$, the eigenvalue $-1$ occurs an even number of times, so \begin{align} \operatorname{tr}A \geq \begin{cases} -n & \text{ if } n \text{ even},\\ -(n-2) & \text{ if } n \text{ odd}. \end{cases} \end{align} Note that \begin{align} f(A)=&\ \operatorname{tr}[(I-A)^T(I-A)]\\ =&\ \operatorname{tr}(2I-A-A^T)= 2(n-\operatorname{tr}(A))\\ \leq&\ \begin{cases} 4n & \text{ if } n \text{ even},\\ 4(n-1) & \text{ if } n \text{ odd}. \end{cases} \end{align} Moreover, $f$ attains this maximum value. In the case $n$ even, we can take $A=-I$. In the case $n$ odd, we take the matrix \begin{align} A = \begin{pmatrix} 1 & 0\\ 0 & -I_{n-1} \end{pmatrix}. \end{align}
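The bounds $f(A)\le 4n$ ($n$ even) and $f(A)\le 4(n-1)$ ($n$ odd) are easy to verify numerically. A quick sketch of my own, checking the two maximisers and sampling random rotations for $n=3$:

```python
# Numeric check of the bounds: f(A) = ||I - A||_F^2 = 2(n - tr A) on O(n).
import numpy as np

def f(A):
    n = A.shape[0]
    return np.linalg.norm(np.eye(n) - A, "fro") ** 2

A_odd = np.diag([1.0, -1.0, -1.0])   # n = 3 (odd):  f = 4(n-1) = 8
A_even = -np.eye(4)                  # n = 4 (even): f = 4n     = 16
print(f(A_odd), f(A_even))           # 8.0 16.0

def rand_rot(n, rng):
    """Random rotation: QR of a Gaussian matrix, det corrected to +1."""
    Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
    if np.linalg.det(Q) < 0:
        Q[:, [0, 1]] = Q[:, [1, 0]]
    return Q

# No sampled rotation in SO(3) should exceed the bound 4(n-1) = 8.
rng = np.random.default_rng(2)
print(all(f(rand_rot(3, rng)) <= 8 + 1e-9 for _ in range(2000)))  # True
```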

Answer (score 0):

Edit OP has clarified since this answer was posted that they are interested in finding, for any matrix $A \in M(3, \Bbb R)$, the matrix in $SO(3)$ from which it is farthest. This answer addresses the special case of the problem when $A$ itself is in $SO(3)$. See user1551's post for a nice solution that covers the more general case.

Hint Since multiplication by orthogonal matrices preserves $\|\cdot\|$, if $B$ is the matrix in $SO(n)$ farthest from a given matrix $A \in SO(n)$, then $B A^{-1}$ is the matrix in $SO(n)$ farthest from $I$, and vice versa.

Thus, it suffices to find which matrix $C \in SO(n)$ achieves the maximum of $$d(I, C)^2 = ||I - C||^2 = \operatorname{tr}[(I - C)^T (I - C)] = \operatorname{tr}(2 I - C - C^T) = 2 (n - \operatorname{tr} C) ,$$ that is, we want to minimize $\operatorname{tr} C$.

Since $C \in SO(n)$, the eigenvalues $\lambda_i$ must satisfy $1 = \det C = \prod \lambda_i$. Likewise, since $C$ is real, any nonreal eigenvalues come in complex conjugate pairs. So for $n = 3$, the eigenvalues of $C$ are $e^{i \theta}, e^{-i \theta}, 1$ for some $\theta$, and so $\operatorname{tr} C = 1 + 2 \cos \theta$. This is minimized for $\cos \theta = -1$, that is, $\theta = \pi$, giving eigenvalues $-1, -1, 1$. But the orthogonal matrices with these eigenvalues are exactly the rotations by a half-turn about some axis---there is one such rotation for each axis, so there is an $\Bbb{RP}^2$'s worth of them---and for these matrices $d(I, C) = 2 \sqrt{2}$. Similar arguments give a sharp upper bound $d(I, C) \leq 2 \sqrt{2 \left\lfloor \frac{n}{2} \right\rfloor}$ for all $n$.
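A half-turn about a unit axis $u$ can be written as $C = 2uu^T - I$ (it fixes $u$ and negates the plane orthogonal to $u$). A small sketch of my own verifying the claimed trace, determinant, and distance for a random axis:

```python
# A half-turn about a random unit axis u: C = 2 u u^T - I.
# Expected: tr C = 1 + 2 cos(pi) = -1, det C = 1, ||I - C||_F = 2*sqrt(2).
import numpy as np

rng = np.random.default_rng(3)
u = rng.standard_normal(3)
u /= np.linalg.norm(u)                # unit axis of rotation

C = 2 * np.outer(u, u) - np.eye(3)    # rotation by pi about u

print(np.trace(C))                           # approximately -1
print(np.linalg.det(C))                      # approximately 1
print(np.linalg.norm(np.eye(3) - C, "fro"))  # approximately 2*sqrt(2)
```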