Box-constrained orthogonal matrix


Given constants $\ell, u \in \mathbb{R}^{3 \times 3}$ and the following system of constraints in $P \in \mathbb{R}^{3 \times 3}$ $$ P^T P = I_{3 \times 3},\quad \ell_{ij} \leq P_{ij} \leq u_{ij}, $$ I would like to find a matrix $P$ which satisfies this system, or determine that it is infeasible. Is there a computationally efficient way to perform this task?

The solution doesn't have to be in closed form, but it should be an algorithm implementable on a computer which runs quickly, exploiting the fact that this is a $9$-dimensional problem.

A numerical algorithm which converges to a solution is also a very good option, but there should be a proof that indeed it converges to a solution of the problem.


There are 3 answers below.


We have a system of quadratic equations and linear inequalities in $\mathrm X \in \mathbb R^{3 \times 3}$

$$\begin{aligned} \rm X^\top X = \rm I_3\\ \rm L \leq X \leq \rm U\end{aligned}$$

where $\rm L \leq U$, of course. The convex hull of the orthogonal group $\mathrm O (3)$ is defined by $\mathrm X^\top \mathrm X \preceq \mathrm I_3$, or, equivalently, by the inequality $\| \mathrm X \|_2 \leq 1$. We then have the following (convex) feasibility problem

$$\begin{array}{ll} \text{minimize} & 0\\ \text{subject to} & \rm L \leq X \leq \rm U\\ & \| \mathrm X \|_2 \leq 1\end{array}$$

The feasibility problem may return any feasible point, for instance one close to the zero matrix. Since the extreme points of the ball $\| \mathrm X \|_2 \leq 1$ are exactly the orthogonal matrices, we instead generate a random matrix $\mathrm C \in \mathbb R^{3 \times 3}$ and minimize the linear objective $\langle \mathrm C, \mathrm X \rangle$, which pushes the minimizer toward the boundary of the (convex) feasible region

$$\begin{array}{ll} \text{minimize} & \langle \mathrm C, \mathrm X \rangle\\ \text{subject to} & \rm L \leq X \leq \rm U\\ & \| \mathrm X \|_2 \leq 1\end{array}$$


Numerical experiment

Suppose we would like to find a $3 \times 3$ matrix that is orthogonal, whose entries on the main diagonal are in $[-0.95, 0.95]$, and whose entries off the main diagonal are in $[-0.5, 0.5]$.

Using NumPy to randomly generate matrix $\rm C$ and CVXPY to solve the convex program,

import cvxpy as cp
import numpy as np

n = 3

# random linear objective (re-run with a fresh C if the solution is not orthogonal)
C = np.random.rand(n, n)

I = np.identity(n)
J = np.ones((n, n))

# box bounds: diagonal entries in [-0.95, 0.95], off-diagonal entries in [-0.5, 0.5]
L = -0.95 * I + (-0.5 * (J - I))
U =  0.95 * I + ( 0.5 * (J - I))

X = cp.Variable((n, n))

# define the optimization problem (norm(X, 2) is the spectral norm)
prob = cp.Problem(cp.Minimize(cp.trace(C.T @ X)),
                  [L <= X, X <= U, cp.norm(X, 2) <= 1])

# solve the optimization problem
print(prob.solve())
print(prob.status)

# print the results
print("X = \n", X.value)
print("Spectral norm of X = ", np.linalg.norm(X.value, 2))
print("Round(X.T * X) = \n", np.round(X.value.T @ X.value, 1))

which outputs the following

-2.04689293246
optimal
X = 
[[-0.94511744  0.20404426 -0.2650367 ]
 [-0.30732554 -0.84243613  0.44563059]
 [ 0.13860874 -0.499968   -0.85597357]]
Spectral norm of X =  1.00521536292
Round(X.T * X) = 
[[ 1. -0. -0.]
 [-0.  1. -0.]
 [-0. -0.  1.]]

I ran the Python script a few (maybe $5$) times until I obtained a matrix $\rm X$ that is (approximately) orthogonal. In other words, the algorithm does not produce such nice results for all choices of $\rm C$.
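For what it's worth, here is a minimal sketch (my own addition, not part of the original script) of how one might automate those retries over random $\rm C$ with the current CVXPY 1.x interface; the helper name find_box_orthogonal and the tolerance are arbitrary choices:

import cvxpy as cp
import numpy as np

def find_box_orthogonal(L, U, max_tries=50, tol=1e-6):
    # retry the convex program with a fresh random C until the minimizer is
    # (approximately) orthogonal, or give up after max_tries attempts
    n = L.shape[0]
    X = cp.Variable((n, n))
    for _ in range(max_tries):
        C = np.random.randn(n, n)                      # random linear objective
        prob = cp.Problem(cp.Minimize(cp.trace(C.T @ X)),
                          [L <= X, X <= U, cp.norm(X, 2) <= 1])
        prob.solve()
        if prob.status not in (cp.OPTIMAL, cp.OPTIMAL_INACCURATE):
            return None                                # relaxation infeasible
        V = X.value
        if np.linalg.norm(V.T @ V - np.eye(n)) < tol:  # orthogonality check
            return V
    return None                                        # search failed

A return value of None is not a certificate of infeasibility; it only means this random search did not find an orthogonal matrix in the box.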


A second approach: we use the following pseudo-Newton iteration, which very quickly produces an orthogonal matrix.

$V_{k+1}=0.5(V_k+V_k^{-T})$.

Indeed, let $f:V\rightarrow V^TV-I_n$, so that $Df_V:H\rightarrow H^TV+V^TH$. The standard Newton iteration is $V_{k+1}=V_k-(Df_{V_k})^{-1}(f(V_k))$, but $Df_{V_k}$ is not invertible; nevertheless, $H=0.5(V_k-V_k^{-T})$ is a solution of $Df_{V_k}(H)=f(V_k)$, and the update $V_{k+1}=V_k-H$ with this particular solution gives $V_{k+1}=0.5(V_k+V_k^{-T})$.
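For readers who prefer Python, a minimal NumPy sketch of this iteration (my own rendering, not part of the original answer) is:

import numpy as np

def orthogonalize(V, iters=10):
    # pseudo-Newton step V <- 0.5 * (V + V^{-T}); for an invertible start this
    # converges quadratically to the orthogonal factor of the polar decomposition
    for _ in range(iters):
        V = 0.5 * (V + np.linalg.inv(V).T)
    return V

V0 = np.random.randn(3, 3)        # generically invertible starting point
V = orthogonalize(V0)
print(np.round(V.T @ V, 6))       # approximately the identity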

A trial consists of the following $3$ steps.

Step 1. We choose an element of $[L,U]$: $V_0=kL+(1-k)U$ where $k$ has the form $i/10,i=0,\cdots,10$.

Step 2. We use the above iterative algorithm with initial point $V_0$ and with $10$ iterations.

Step 3. If the obtained limit is in the box, then we stop. Otherwise, we continue with the next value of $k$. (A Python sketch of one such trial is given just below; the Maple procedure follows further down.)
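A minimal Python sketch of one such trial (my own rendering; the helper names are mine, and the starting points are assumed to be invertible):

import numpy as np

def orthogonalize(V, iters=10):
    # pseudo-Newton iteration V <- 0.5 * (V + V^{-T})
    for _ in range(iters):
        V = 0.5 * (V + np.linalg.inv(V).T)
    return V

def trial(L, U, tol=1e-9):
    # Steps 1-3: scan starting points V0 = k*L + (1-k)*U on the segment [L, U],
    # orthogonalize each one, and return the first limit that lies in the box
    for i in range(11):
        k = i / 10
        V = orthogonalize(k * L + (1 - k) * U)
        if np.all(V >= L - tol) and np.all(V <= U + tol):
            return V
    return None    # all 11 starting points failed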

The computation time for $1000=10^3$ trials is about $17$ seconds. There are on average $5$ failures per $10000=10^4$ trials.

Below is a copy of the procedure in Maple (the result is in $V$).

restart;
with(LinearAlgebra);
Digits := 20;

roll := rand(1 .. 3);                 # random integers in {1, 2, 3}

n := 3; id := IdentityMatrix(n):

ve := Vector(11);
vv := Vector([.5, .6, .4, .7, .3, .8, .2, .9, .1, 1, 0]);   # order in which the values of k are tried

t := time():
for l from 1 to 10000 do
  # random orthogonal matrix U via the Cayley transform of a random skew-symmetric K
  K := RandomMatrix(n, n, shape = skewsymmetric):
  U := evalf((id - K).(id + K)^(-1)):

  # random box [U + Z2, U + Z1] containing U, so that the problem is feasible
  Z1 := Matrix(3, 3, (i, j) -> 0.3*roll()): Z2 := Matrix(3, 3, (i, j) -> -0.1*roll()):
  U1 := U + Z1: U2 := U + Z2:

  test := [10, 10]; co := 0;
  while test <> [0, 0] and co < 11 do
    co := co + 1: k := vv[co]:

    # Step 1: starting point on the segment [U2, U1]
    V := k*U1 + (1 - k)*U2:

    # Step 2: ten pseudo-Newton iterations V <- 0.5*(V + V^(-T))
    for i to 10 do V := .5*(V + 1/Transpose(V)) end do;

    # Step 3: count the box violations (test = [0, 0] means V is in the box)
    test := [0, 0]:
    R1 := U + Z1 - V: R2 := U + Z2 - V:
    for i from 1 to 3 do
      for j from 1 to 3 do
        if R1[i, j] < 0 then test := test + [1, 0]: fi:
        if R2[i, j] > 0 then test := test + [0, 1]: fi:
      od: od:
    ve[co] := test:

    # a failure: none of the 11 starting points gave a matrix in the box
    if co = 11 and test <> [0, 0] then print(ve[7 .. 11], U, Z1, Z2); fi:
  od: od:
time() - t;

EDIT. Case 1. There are solutions in the box and we want to find an approximation of one of them. A Maple specialist advised me to use the command NLPSolve; unfortunately, NLPSolve only works (in general) if you give it an initial point not too far from a solution. Still, we can proceed as follows:

STEP 1. As in the procedure above, we pick a point $V_0=kL+(1-k)U$, where $k$ has the form $i/10$, $i=0,\cdots,10$, on the segment $[L,U]$ and project it onto $O(3)$, obtaining $V_1$. In a few cases, none of the $11$ points $V_1$ lies in the box; yet some are very close to the boundary of the box.

STEP 2. We use NLPSolve (this routine accepts only closed boxes) with $V_1$ as the initial point; the software then works in the tangent space of $O(3)$ at $V_1$ and, with a gradient method, pushes the point towards the boundary of the box. In general we obtain in this way a point $V_2$ that is very close to the boundary of the box, and we are done. For $20000=2\cdot 10^4$ trials (as in the procedure above), I did not encounter a single failure (among the $11$ points $V_1$, NLPSolve always succeeds for several of them)!
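In Python one could imitate this refinement step with a bound-constrained local solver such as scipy.optimize.minimize. The sketch below is my own analogue of the idea, not the NLPSolve call itself: it minimizes the orthogonality residual over the box, starting from (a clipped copy of) the projected point $V_1$.

import numpy as np
from scipy.optimize import minimize

def refine_in_box(V1, L, U):
    # push an (approximately) orthogonal V1 into the box [L, U] while keeping
    # X^T X close to the identity; a zero final residual means success
    n = V1.shape[0]

    def ortho_residual(x):
        V = x.reshape(n, n)
        return np.linalg.norm(V.T @ V - np.eye(n)) ** 2

    x0 = np.clip(V1.ravel(), L.ravel(), U.ravel())    # start inside the box
    bounds = list(zip(L.ravel(), U.ravel()))          # the box constraints
    res = minimize(ortho_residual, x0, bounds=bounds, method="L-BFGS-B")
    return res.x.reshape(n, n), res.fun

If the returned residual is numerically zero, the minimizer is an orthogonal matrix inside the box; a nonzero residual proves nothing, since the local solver may simply be stuck.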

Of course, there is no proof that the method never fails. One should even be able to construct a counter-example by arranging for there to be exactly one solution (on the boundary of the box). Of course, the OP did not take much of a risk by offering a big bounty for a rigorous proof; he will be forgiven, because $500000$ dollars is the price of a nice house.

Case 2. There are no solutions in the box and we want to prove this fact. Unless you find an obvious obstruction, for instance a bound forcing some $x_{ij}>1$ (impossible for an orthogonal matrix), I think we cannot avoid a formal proof. Under these conditions, I see no method other than Gröbner bases. Over $\mathbb{C}$ there are black boxes that do the job when the number of relations is not too large; unfortunately, it is known that the problem is in general much harder in the real case; in particular, it becomes necessary to use gradient methods. In any case, I do not know how to do this job with Maple.


Here is one suggestion. You can eliminate the $P^TP=I$ constraint as follows.

$P^TP=I$ means that $P$ is an orthogonal matrix; if in addition $\det P = 1$, it is a rotation matrix. Every rotation matrix can be written as $\exp([\omega])$, where $\exp$ is the matrix exponential and $$[\omega]=\left( \begin{array}{ccc} 0 & -a_3 & a_2 \\ a_3 & 0 & -a_1 \\ -a_2 & a_1 & 0 \\ \end{array} \right)$$ is a skew-symmetric matrix. (Orthogonal matrices with $\det P = -1$ are not of this form, but each of them is obtained by flipping the sign of one column of a rotation matrix.) Also, from here (Matrix exponential of a skew symmetric matrix) you can find the matrix exponential analytically: $$x = \sqrt{a_1^2+a_2^2+a_3^2}; \quad P=\exp([\omega])= I + \dfrac{\sin x}{x}[\omega] + \dfrac{1-\cos x}{x^2}[\omega]^2.$$

What I can think of is to perform a brute-force search on $a_1$, $a_2$, and $a_3$ to see whether any values for them result in a $P$ that satisfies your bound constraints.
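A rough Python sketch of such a search (my own addition; the grid resolution is arbitrary, and only rotations, i.e. $\det P = +1$, are covered):

import numpy as np

def rotation_from_axis(a1, a2, a3):
    # Rodrigues' formula P = exp([w]) for the skew-symmetric matrix [w]
    W = np.array([[0.0, -a3,  a2],
                  [ a3, 0.0, -a1],
                  [-a2,  a1, 0.0]])
    x = np.sqrt(a1**2 + a2**2 + a3**2)
    if x < 1e-12:
        return np.eye(3)
    return np.eye(3) + (np.sin(x) / x) * W + ((1.0 - np.cos(x)) / x**2) * (W @ W)

def brute_force(L, U, steps=60):
    # every rotation has an axis-angle vector of norm <= pi, and that ball is
    # contained in the cube [-pi, pi]^3, so scanning the cube is enough
    grid = np.linspace(-np.pi, np.pi, steps)
    for a1 in grid:
        for a2 in grid:
            for a3 in grid:
                P = rotation_from_axis(a1, a2, a3)
                if np.all(P >= L) and np.all(P <= U):
                    return P
    return None    # no rotation found on this grid; reflections are not covered

With $60$ points per axis this is roughly $2 \times 10^5$ evaluations of a $3 \times 3$ formula, which is cheap, but a coarse grid can of course miss solutions that occupy only a thin sliver of the parameter space.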