Finding the global maximum of sum-of-exponentials


Problem: find $R$ that maximizes the following: $$f(R) = \sum_{i} k_i \exp\left(-y_i^T R x_i\right)\\ k_i \in \mathbb{R},\; y_i, x_i \in \mathbb{R}^2\\ R \in SO(2)\;\; (\text{i.e. a 2D rotation matrix: } 2\times 2\text{, orthogonal, } \det = 1)$$

I'd like to understand how to solve the above problem. I've described the problem for 2 dimensions to keep it simple, but I believe that a solution would generalize to higher dimensions (please correct me if I'm wrong about this).

I would like to understand the different aspects of this problem:

(1) whether a global maximum exists or not

(2) can the (global) solution be found in closed form (if so, how?)

(3) how to solve it with gradient descent (or another technique) if there is no closed-form solution.

I'm happy to provide clarification if the problem formulation is unclear. (This is my first post, so apologies in advance if I'm not following the guidelines properly.)

Thanks!!

2 Answers


Given $x_i$, $y_i$, and $k_i$, you can rewrite the expression for $f(R)$. Rotations in two dimensions are described by a matrix of a simple form: a rotation by $\theta$ has the matrix $$R(\theta)=\begin{pmatrix}\cos\theta&-\sin\theta\\\sin\theta & \cos\theta\end{pmatrix}$$ Expanding $-y_i^T R(\theta) x_i$ with this gives $$f(R)=f(\theta)=\sum_ik_i\exp\left[-(y_{i1}x_{i1}+y_{i2}x_{i2})\cos\theta+(y_{i1}x_{i2}-y_{i2}x_{i1})\sin\theta\right]$$ Note that each exponent has the form $a_i\cos\theta+b_i\sin\theta=r_i\cos(\theta-\varphi_i)$ with $r_i=\sqrt{a_i^2+b_i^2}$. The function $f(\theta)$ is continuous in $\theta$ and $2\pi$-periodic, so it attains both a global maximum and a global minimum on the interval $[0,2\pi)$. That's the answer to question 1.
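To make this concrete, here is a minimal numerical sketch (my own illustration, not part of the original answer): it evaluates $f(\theta)$ on a grid over one period and reads off the best angle. The data `k`, `X`, `Y` are arbitrary placeholders.

```python
import numpy as np

def f_theta(theta, k, X, Y):
    """Evaluate f(R(theta)) = sum_i k_i * exp(-y_i^T R(theta) x_i).

    k: (m,) weights; X, Y: (m, 2) arrays whose rows are x_i and y_i.
    """
    c, s = np.cos(theta), np.sin(theta)
    R = np.array([[c, -s], [s, c]])
    # einsum computes y_i^T R x_i for every i at once
    return np.sum(k * np.exp(-np.einsum('ij,jk,ik->i', Y, R, X)))

# Arbitrary illustrative data
rng = np.random.default_rng(0)
k = rng.normal(size=5)
X = rng.normal(size=(5, 2))
Y = rng.normal(size=(5, 2))

# Coarse grid search over one period, then read off the best angle
thetas = np.linspace(0.0, 2 * np.pi, 1000, endpoint=False)
vals = np.array([f_theta(t, k, X, Y) for t in thetas])
theta_best = thetas[np.argmax(vals)]
```

A grid like this is a cheap way to bracket the global maximum on $[0, 2\pi)$; the angle can then be polished with a local root-finder as suggested below.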

You can take the derivative with respect to $\theta$ and set it equal to $0$. Unfortunately, it looks pretty complicated: a sum of exponentials multiplied by trigonometric functions, i.e. a transcendental equation. I'm not sure it can be simplified in such a way as to get a nice closed-form solution. The nice thing is that you can write out this equation and solve it numerically; my guess is that even something simple would work fine. Look up Newton's method.
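A sketch of that suggestion (my own code, with arbitrary placeholder data): writing each exponent as $g_i(\theta)=a_i\cos\theta+b_i\sin\theta$, obtained by expanding $-y_i^T R(\theta)x_i$ directly, we have $f'=\sum_i k_i g_i' e^{g_i}$ and $f''=\sum_i k_i (g_i''+g_i'^2)e^{g_i}$ with $g_i''=-g_i$, so Newton's method on $f'(\theta)=0$ is a few lines. A coarse grid seed keeps it in the basin of the maximum.

```python
import numpy as np

def coeffs(X, Y):
    """Exponent g_i(theta) = a_i cos(theta) + b_i sin(theta),
    from expanding -y_i^T R(theta) x_i."""
    a = -np.sum(Y * X, axis=1)                  # -(y_i . x_i)
    b = Y[:, 0] * X[:, 1] - Y[:, 1] * X[:, 0]   # y_i1 x_i2 - y_i2 x_i1
    return a, b

def newton_on_fprime(k, a, b, theta0, iters=50, tol=1e-12):
    """Newton iteration for f'(theta) = 0, started near a maximum."""
    theta = theta0
    for _ in range(iters):
        g = a * np.cos(theta) + b * np.sin(theta)
        gp = -a * np.sin(theta) + b * np.cos(theta)   # g'
        e = k * np.exp(g)
        fp = np.sum(e * gp)                # f'
        fpp = np.sum(e * (gp ** 2 - g))    # f''  (using g'' = -g)
        if abs(fpp) < 1e-15:
            break
        step = fp / fpp
        theta -= step
        if abs(step) < tol:
            break
    return theta

# Arbitrary illustrative data
rng = np.random.default_rng(0)
k = rng.normal(size=5)
X = rng.normal(size=(5, 2))
Y = rng.normal(size=(5, 2))
a, b = coeffs(X, Y)

f = lambda t: np.sum(k * np.exp(a * np.cos(t) + b * np.sin(t)))

# Seed Newton with a coarse grid maximum so it converges to the right root
grid = np.linspace(0.0, 2 * np.pi, 360, endpoint=False)
theta0 = grid[np.argmax([f(t) for t in grid])]
theta_star = newton_on_fprime(k, a, b, theta0)
```

Near a nondegenerate maximum $f''<0$, so the iteration converges quadratically from a good enough seed.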


For the general problem $(n\geq 2)$:
Since $SO(n)$ is compact and $f$ is continuous, a global maximum (and minimum) exists.
A closed-form solution is unlikely to exist, and there are probably multiple local optima in general.

You can find a local optimum from a starting point using gradient descent/ascent or a similar method, by equipping $M_n(\mathbb R)$ with the Frobenius inner product.
If your current position is $R$, the tangent space of $SO(n)$ at $R$ is the space $R\cdot \mathrm{Skew}_n$,
which has the orthonormal basis $$\mathrm r_{ij} = \frac{R(\mathrm e_{ij} - \mathrm e_{ji})}{\sqrt 2},\quad i < j$$ The gradient is then $\sum_{i<j} \nabla_{\mathrm r_{ij}} f(R)\, \mathrm r_{ij}$, where $\nabla_{\mathrm r_{ij}} f(R)$ is the directional derivative of $f$ at $R$ along $\mathrm r_{ij}$, which is relatively simple to compute in this case.
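As a sketch of this recipe (my own illustration, not the answerer's code): rather than assembling the gradient basis vector by basis vector, one can equivalently project the Euclidean gradient $G=\sum_i -k_i e^{-y_i^T R x_i}\, y_i x_i^T$ onto the tangent space via the skew part of $R^T G$, and retract back onto $SO(n)$ with the matrix exponential (`scipy.linalg.expm`). The step-halving and the placeholder data are assumptions of this sketch.

```python
import numpy as np
from scipy.linalg import expm

def f(R, k, X, Y):
    """f(R) = sum_i k_i exp(-y_i^T R x_i); rows of X, Y are x_i, y_i."""
    return np.sum(k * np.exp(-np.einsum('ij,jk,ik->i', Y, R, X)))

def euclidean_grad(R, k, X, Y):
    """d/dR of f: sum_i -k_i exp(-y_i^T R x_i) y_i x_i^T."""
    w = -k * np.exp(-np.einsum('ij,jk,ik->i', Y, R, X))
    return np.einsum('i,ij,ik->jk', w, Y, X)

def ascend(k, X, Y, n, steps=300, lr=0.1):
    """Riemannian gradient ascent on SO(n) with step halving."""
    R = np.eye(n)
    fR = f(R, k, X, Y)
    for _ in range(steps):
        G = euclidean_grad(R, k, X, Y)
        S = 0.5 * (R.T @ G - G.T @ R)   # skew(R^T G): ascent direction in the Lie algebra
        step = lr
        while step > 1e-12:
            R_new = R @ expm(step * S)  # retraction: stays in SO(n) exactly
            f_new = f(R_new, k, X, Y)
            if f_new > fR:              # accept only improving steps
                R, fR = R_new, f_new
                break
            step *= 0.5
        else:
            break                       # no improving step: (near-)stationary
    return R

# Arbitrary illustrative data, n = 3 to show the general case
rng = np.random.default_rng(1)
k = rng.normal(size=8)
X = rng.normal(size=(8, 3))
Y = rng.normal(size=(8, 3))
R_star = ascend(k, X, Y, 3)
```

Since $\frac{d}{dt}f(R e^{tS})\big|_{t=0}=\langle R^T G, S\rangle_F=\|S\|_F^2\geq 0$, a small enough step always improves $f$ away from stationary points, and the exponential of a skew-symmetric matrix keeps the iterate exactly in $SO(n)$.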