Efficient solution for a quadratic + norm objective.

133 Views Asked by At

I want to minimize an objective function of the following form: $$ \begin{split} \text{Minimize} \quad & x^T D_x x + y^T D_y y + z^T D_z z + q_x^T x + q_y^T y + q_z^T z + \lambda \cdot \begin{Vmatrix} y - x \\ z - x \end{Vmatrix}_2 \end{split} $$

where $x, y, z \in R^n, \lambda > 0$ and $D \in R^{3n}$ is a positive-definite diagonal matrix. In addition, $n$ is "small" - less than 20.

I want to be able to solve it as efficiently as possible. I believe that there is a way to construct a dual that is a simple 1-D optimization problem over an interval that is quickly solvable, but I was not able to do so.

1

There are 1 best solutions below

1
On BEST ANSWER

Let's look at the generic problem $$\inf_x \tfrac{1}{2}x^TPx+q^Tx+\lambda\|Rx\|_2$$ We can use a little trick to quickly get a dual without going through the full Lagrangian motions: since $\|w\|_2=\sum_{\|v\|_2\leq 1}v^Tw$, we have $$\inf_x \sup_{\|w\|_2\leq 1} \tfrac{1}{2}x^TPx+q^Tx+w^TRx$$ Throwing caution to the wind (hey this isn't a journal article) we swap the $\inf$ and the $\sup$: $$\sup_{\|w\|_2\leq 1} \inf_x \tfrac{1}{2}x^TPx+q^Tx+w^TRx$$ Since $P$ is positive definite the inner problem has a unique minimizer: $$Px+q+R^Tw=0\quad\Longrightarrow\quad x = -P^{-1}(R^Tw+q)$$ Substitution yields $$\begin{aligned} &\sup_{\|w\|_2\leq 1} -\tfrac{1}{2}(R^Tw+q)^TP^{-1}(R^Tw+q)= -\inf_{\|w\|_2\leq 1} \tfrac{1}{2}\|P^{-1/2}(R^Tw+q)\|_2^2 \\&\qquad= -\inf_{\|w\|_2\leq 1} \tfrac{1}{2}w^TRP^{-1}R^Tw+q^TP^{-1}R^Tw+\tfrac{1}{2}q^TP^{-1}q \end{aligned}$$

This is easier to solve than the original problem, for sure.

In fact, it's the well-known trust region subproblem. Here's what we know about it. First, try computing the unconstrained minimizer $w=(RP^{-1}R^T)^{-1}RP^{-1}q$. If it happens to satisfy $\|w\|_2\leq 1$, you're done: that's your constrained optimum. Otherwise, we know that the solution $w$ is a solution of the problem $$\inf_w \|P^{-1/2}(R^Tw+q)\|_2^2+\bar{\lambda}\|w\|_2^2$$ where $\bar{\lambda}\geq 0$ is chosen so that $\|w\|_2=1$. For fixed $\bar{\lambda}$, the solution to this problem is just $(RP^{-1}R^T+\bar{\lambda}I)^{-1}RP^{-1}q$. You can use bisection to find the value of $\bar{\lambda}$.

If you're willing to compute the eigenvalue decomposition of $R^TP^{-1}R$ (or the SVD of $P^{-1/2}R$), you can solve it in $O(n)$ operations after that, I believe. But I don't know if you can readily exploit the structure of $P$ and $R$ here to speed up that SVD.