Efficiently Finding a Rotation that Makes a Set of Vectors with Arbitrary Entries Non-Negative

Given a set of $k$ vectors $\{x_1, x_2, \ldots, x_k\} \subset \mathbb{R}^n$ with arbitrary entries, where $k \leq n$, we seek an efficient algorithm to compute an orthogonal matrix $Q \in \mathbb{R}^{n \times n}$ that rotates them to a new set of vectors $\{Qx_1, Qx_2, \ldots, Qx_k\}$ whose entries are all non-negative. Note that a solution may not exist, and when it does exist, it need not be unique.

Formally, I want to solve the following optimization problem:

\begin{equation*} \max_Q \min_{1 \leq i \leq k, 1 \leq j \leq n} (Qx_i)_j \quad \text{subject to } Q^TQ = I_n \end{equation*}

where $(Qx_i)_j$ denotes the $j$th entry of the vector $Qx_i$.
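As a sanity check on existence: for a single vector ($k = 1$) a solution always exists, since one can reflect $x$ onto the all-ones direction, whose entries are all positive. A minimal numpy sketch (the function name `rotate_to_positive_orthant` is mine, not standard):

```python
import numpy as np

def rotate_to_positive_orthant(x):
    """For a single vector (k = 1), build an orthogonal Q with Qx > 0
    by reflecting x onto the all-ones direction via a Householder reflection."""
    n = x.shape[0]
    u = x / np.linalg.norm(x)
    v = np.ones(n) / np.sqrt(n)        # target unit vector: all entries positive
    w = u - v
    if np.linalg.norm(w) < 1e-12:      # x already points along the ones direction
        return np.eye(n)
    # Householder reflection H maps u to v since ||u|| = ||v|| = 1
    return np.eye(n) - 2.0 * np.outer(w, w) / (w @ w)

x = np.array([3.0, -1.0, 2.0, -4.0])
Q = rotate_to_positive_orthant(x)
print(np.allclose(Q.T @ Q, np.eye(4)))  # True
print(np.all(Q @ x > 0))                # True
```

Here $Qx = \|x\|\,v$, so every entry equals $\|x\|/\sqrt{n} > 0$. The difficulty is entirely in making $k > 1$ vectors non-negative simultaneously.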

Unfortunately, this max-min problem is hard to solve directly, so I am considering two relaxations.

\begin{equation*} \min_Q - tr(QXJ) \quad \text{subject to } Q^TQ = I_n \end{equation*}

and

\begin{equation*} \min_Q - \mathrm{softmin}(QX) \quad \text{subject to } Q^TQ = I_n \end{equation*}

where $J$ is the $k \times n$ matrix of all ones and $X = [x_1, x_2, \ldots, x_k]$ is the $n \times k$ matrix whose columns are the given vectors.

The first problem maximizes the sum of all entries of $QX$, which is not equivalent to the original max-min objective. Its solution can be read off from the SVD of the matrix $XJ$, as in the orthogonal Procrustes problem. The second problem maximizes a soft minimum of the entries of $QX$, which may be difficult to solve due to the more complex objective function.
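For the first relaxation, the standard Procrustes argument gives the closed form: if $XJ = U\Sigma V^T$, then $Q^* = VU^T$ maximizes $\mathrm{tr}(QXJ)$, attaining the sum of the singular values. A short numpy check:

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 5, 3
X = rng.standard_normal((n, k))   # columns are the vectors x_1..x_k
J = np.ones((k, n))

# Maximize tr(QXJ) over orthogonal Q: with XJ = U diag(S) V^T,
# the maximizer is Q* = V U^T (orthogonal Procrustes).
U, S, Vt = np.linalg.svd(X @ J)
Q = Vt.T @ U.T

print(np.allclose(Q.T @ Q, np.eye(n)))                # orthogonal
print(np.isclose(np.trace(Q @ X @ J), S.sum()))       # optimal trace value
```

Note that $XJ = (X\mathbf{1}_k)\mathbf{1}_n^T$ has rank one, which is another way of seeing that this relaxation only controls the column-sum direction, not the individual entries.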

Any insights or ideas on how to approach these problems are welcome. Thank you.

Update:

For the softmin function, suppose we use LogSumExp (LSE) as the smooth approximation of the minimum. We can write $X = [x_1, x_2, \cdots, x_k]$ as a block matrix of $k$ vectors in $\mathbb{R}^n$. Then $$-\mathrm{LSE}(-QX) = -\log \sum_{i,j} \exp(- e_j^T Q x_i),$$ whose Euclidean gradient is $$G = \frac{\sum_{i,j} \exp(-e^T_j Q x_i)\, e_j x_i^T}{\sum_{i,j} \exp(-e^T_j Q x_i)}.$$ Projecting onto the tangent space of the orthogonal group gives the skew-symmetric matrix $$\Omega = \mathrm{skew}(Q^T G) = \frac{\sum_{i,j} \exp(-e^T_j Q x_i)\, (Q^T e_j x_i^T - x_i e_j^T Q)}{2\sum_{i,j} \exp(-e^T_j Q x_i)},$$ and from that one can use a QR decomposition to retract $Q(I + \eta\Omega)$ back onto the orthogonal group. Empirically, however, I find this method too computationally heavy. Any thoughts on easier solutions would be much appreciated.
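For concreteness, here is a minimal numpy sketch of that scheme (function name `softmin_ascent`, step size, and iteration count are my choices, not part of the question): Euclidean gradient, projection to the skew-symmetric part, then a QR retraction.

```python
import numpy as np

def softmin_ascent(X, steps=1000, eta=0.15, seed=0):
    """Riemannian gradient ascent of f(Q) = -LSE(-QX) over the orthogonal
    group, using a QR retraction. A sketch, not a tuned implementation."""
    n, k = X.shape
    Q = np.linalg.qr(np.random.default_rng(seed).standard_normal((n, n)))[0]
    for _ in range(steps):
        W = np.exp(-(Q @ X))               # weights w_ij = exp(-e_j^T Q x_i)
        G = (W @ X.T) / W.sum()            # Euclidean gradient of f at Q
        Omega = 0.5 * (Q.T @ G - G.T @ Q)  # skew-symmetric Riemannian gradient
        Qf, R = np.linalg.qr(Q @ (np.eye(n) + eta * Omega))
        Q = Qf * np.sign(np.diag(R))       # QR retraction (fix column signs)
    return Q

# 2-D sanity check: rotate a vector with a negative entry into the orthant
X = np.array([[1.0], [-1.0]])
Q = softmin_ascent(X)
print(np.all(Q @ X > 0))   # True: Qx converges toward (1, 1)
```

The per-step cost is dominated by the $n \times n$ QR factorization, which is presumably the "computation heavy" part for large $n$.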

Answer:

If you are looking for a pragmatic solution, this can be expressed as an instance of quadratically constrained quadratic programming (QCQP) and handed to an off-the-shelf QCQP solver. You would have to experiment to see whether this yields a useful result in practice, since solving QCQP problems is NP-hard in the worst case.

In particular, we obtain linear constraints $(Qx_i)_j \ge 0$ for each $i,j$, together with quadratic constraints $(Q^\top Q)_{ij} = 1_{i=j}$, so the problem can be viewed as an instance of QCQP.
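As a rough illustration of that formulation (not a specialized QCQP solver): maximize a margin $t$ subject to $(Qx_i)_j \ge t$ and $Q^\top Q = I$, handed to scipy's general-purpose SLSQP. The function name `solve_qcqp` and the margin variable are my additions.

```python
import numpy as np
from scipy.optimize import minimize

def solve_qcqp(X):
    """Max-min as a nonconvex quadratically constrained program.
    Variables: the n^2 entries of Q plus the margin t."""
    n, k = X.shape
    iu = np.triu_indices(n)

    def unpack(z):
        return z[:n * n].reshape(n, n), z[-1]

    def neg_margin(z):                       # maximize t
        return -z[-1]

    def orth_residual(z):                    # Q^T Q = I (upper triangle only)
        Q, _ = unpack(z)
        return (Q.T @ Q - np.eye(n))[iu]

    def margin_ineq(z):                      # (Q x_i)_j - t >= 0 for all i, j
        Q, t = unpack(z)
        return (Q @ X).ravel() - t

    z0 = np.concatenate([np.eye(n).ravel(), [0.0]])
    res = minimize(neg_margin, z0, method="SLSQP",
                   constraints=[{"type": "eq", "fun": orth_residual},
                                {"type": "ineq", "fun": margin_ineq}])
    return unpack(res.x)

X = np.array([[1.0], [-1.0]])
Q, t = solve_qcqp(X)
print(np.allclose(Q.T @ Q, np.eye(2), atol=1e-4))
print(t)   # positive iff a rotation into the orthant was found
```

A dedicated QCQP or Riemannian solver would scale better; this only shows how the constraints line up.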


Another possible approach would be to write $Q=e^A$ where $A$ is a skew-symmetric matrix, then try to solve the optimization problem

$$\max_A \; \min_{i,j} \; (e^A x_i)_j,$$

where $A$ ranges over all skew-symmetric matrices. Notice that this eliminates the orthogonality constraint: it becomes an unconstrained optimization problem, with the entries of $A$ above the diagonal as the variables to optimize over. You could try any standard optimization algorithm, such as gradient descent, to see how well it works on this problem. You can also try replacing the min with a softmin whose temperature is annealed slowly as optimization proceeds.
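A small sketch of that idea, assuming a fixed-temperature LSE surrogate for the min and scipy's BFGS with numerical gradients (the name `solve_expm` and the inverse temperature `beta` are my choices):

```python
import numpy as np
from scipy.linalg import expm
from scipy.optimize import minimize

def solve_expm(X, beta=20.0, seed=0):
    """Parameterize Q = expm(A) with A skew-symmetric, and minimize a smooth
    LSE surrogate for -min_{i,j} (Qx_i)_j. Unconstrained in the strict
    upper triangle of A."""
    n, k = X.shape
    iu = np.triu_indices(n, 1)

    def make_Q(a):
        A = np.zeros((n, n))
        A[iu] = a
        A = A - A.T                     # skew-symmetric
        return expm(A)                  # special orthogonal (det = 1)

    def objective(a):                   # (1/beta) * LSE(-beta * QX) ~ -min entry
        return np.log(np.exp(-beta * (make_Q(a) @ X)).sum()) / beta

    a0 = np.random.default_rng(seed).standard_normal(iu[0].size) * 0.1
    res = minimize(objective, a0, method="BFGS")
    return make_Q(res.x)

X = np.array([[1.0], [-1.0]])
Q = solve_expm(X)
print((Q @ X).min())   # close to 1 when a rotation into the orthant exists
```

Each objective evaluation costs a matrix exponential, so for large $n$ a Cayley transform $Q = (I - A)^{-1}(I + A)$ is a common cheaper alternative with the same skew-symmetric parameterization.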

(Strictly speaking, this captures the orthogonal matrices with determinant 1. You also need to separately handle the case of orthogonal matrices with determinant $-1$, but that is easy to take care of via a variation on the above.)