Dual problem of minimizing a quadratic objective function with an infinity norm inequality constraint

1.8k Views Asked by At

In Banerjee et al. (2008) the authors derived from (eq 4) : $$\operatorname{min}_{y}\{y^TWy : \|y-S_j\|_{\infty}\leq \lambda \} $$ the dual problem (eq 5) $$ \operatorname{min}_x x^TW^{-1}x -S_j^Tx+\lambda\|x\|_1$$ where $S_j$ is a vector, $W\succ 0$. I can't find this dual problem from the primal. I know that the dual of $\ell_{\infty}$ norm is $\ell_1$ but it didn't help me much. Any idea ? Thank you.

3

There are 3 best solutions below

2
On BEST ANSWER

I believe the easiest way is directly using Lagrangian duality. Adding an auxiliary variable $z = y - S_j$, your problem becomes:

$$ \begin{aligned} \text{mininize} &\quad y^T W y \\ \text{subject to} &\quad ||z||_\infty \leq \lambda \\ &\quad z - y + S_j = 0 \end{aligned} $$

Using only the linear constraint for the Lagrangian, we have: $$ \begin{aligned} L(y, z; x) &= y^T W y + x^T(z - y + S_j) \\ &= y^T W y - x^T y + x^T S_j + x^T z \end{aligned} $$

Minimizing the Lagrangian above yields: $$ \begin{aligned} q(x) &= \inf_{y,z} \{ L(y, z; x) : ||z||_\infty \leq \lambda \} \\ &= \inf_y \{ y^T W y - x^T y \} + \inf_z \{ x^T z : ||z||_\infty \leq \lambda \} + x^T S_j \\ &= \inf_y \{ y^T W y - x^T y \} - \sup_z \{ -x^T z : ||z||_\infty \leq \lambda \} + x^T S_j \\ &= \inf_y \{ y^T W y - x^T y \} - \lambda \sup_z \{ (-x)^T (\tfrac{1}{\lambda} z) : ||\tfrac{1}{\lambda}z||_\infty \leq 1 \} + x^T S_j \end{aligned} $$ The infimum can be computed by comparing the gradient to zero, since it is a minimization of a convex quadratic function, and it yields $$-\frac{1}{4} x^T W^{-1} x$$ The supermum is precisely the definition of the dual norm, and thus it yields $$||-x||_1 = ||x||_1$$ So the dual problem is: $$ \text{maximize} \quad q(x) := -\frac{1}{4} x^T W^{-1} x - \lambda ||x||_1 + x^T S_j $$ When cast as a minimization problem, it is almost what you wanted, up to the constant $\tfrac{1}{4}$ before the quadratic term. Since in the paper they only want to say something about the structure of the problem , and thus the constant is not really important, I believe that this small technical error does not make any different with respect to their claim in the paper.

1
On

Here's my attempt in finding the dual problem. You will notice that its formulation is slightly different from the one stated in the paper. This means that I might be doing something wrong in the derivation, however, I thought I should post this answer just to show one approach for finding the dual.

Expressing the primal problem as (with $S_j$ written as $s$ for clarity)

$$ \begin{align} \text{minimize } & y^TWy\\ \text{subject to } & y-s-\lambda1\leq 0\\ & -y+s-\lambda1 \leq 0 \end{align} $$ where $1$ is the all-ones vector, the dual problem can be written as

$$ \begin{align} \text{maximize } & \inf_y \mathcal{L}(y,\mu_1,\mu_2)\\\ \text{subject to } & \mu_1\geq 0\\ & \mu_2 \geq 0 \end{align} $$

with optimization variables $\mu_1$, $\mu_2$, where the Lagrangian $\mathcal{L}$ equals

$$ \mathcal{L} \triangleq y^TWy + \mu_1^T(y-s-\lambda1)+\mu_2^T(-y+s-\lambda1). $$

By standard differential calculus, it follows that $\mathcal{L}$ is minimized w.r.t. $y$, for $y=y^*=\frac{1}{2}W^{-1}(\mu_2-\mu_1)$, which leads to the explicit dual problem formulation

$$ \begin{align} \text{maximize } & -\frac{1}{4}(\mu_2-\mu_1)^TW^{-1}(\mu_2-\mu_1)+s^T(\mu_2-\mu_1) -\lambda 1^T(\mu_2-\mu_1) - 2\lambda 1^T \mu_1\\ \text{subject to } & \mu_1\geq 0\\ & \mu_2 \geq 0 \end{align} $$

With the optimization variable transformation $\mu_2 = x + \mu_1$, the above can be written as (writing the objective as a minimization problem)

$$ \begin{align} \text{minimize } & \frac{1}{4}x^TW^{-1}x-s^Tx +\lambda \|x\|_1 + 2\lambda \| \mu_1\|_1\\ \text{subject to } & \mu_1\geq 0\\ & x+\mu_1 \geq 0 \end{align} $$

Now, notice that $\mu_1$ affects only the last term of the objective function, which is clearly minimized for $\mu_1 = 0$. This, in turn, simplifies the dual problem as

$$ \begin{align} \text{minimize } & \frac{1}{4}x^TW^{-1}x-s^Tx +\lambda \|x\|_1 \\ \text{subject to } & x\geq 0\\ \end{align} $$

1
On

I personally find it easiest to use conic formulations to get to the dual form. In this case, we have: \begin{array}{ll} \text{minimize} & y^TWy \\ \text{subject to} & (y - S_j, \lambda) \in \mathcal{K}_{\ell_1} \end{array} where $\mathcal{K}_{\ell_1}$ is the epigraph of the $\ell_1$ norm, a cone. The Lagrangian is $$L(y,x,s) = y^TWy - \langle (y - S_j, \lambda), (x, s) \rangle = y^TWy - (y-S_j)^T x - \lambda s$$ where $(x,s)$ lies in the dual cone $\mathcal{K}_{\ell_1}^*=\mathcal{K}_{\ell_\infty}$. The optimality condition on $y$ is $2Wy - x = 0$, or $y = \frac{1}{2} W^{-1} x$. So the dual function is $$g(x,s) = \frac{1}{4} x^TW^{-1} x - \frac{1}{2} x^TW^{-1} x + S_j^T x - \lambda s = - \frac{1}{4} x^TW^{-1} x + S_j^T x - \lambda s$$ The dual problem is $$\begin{array}{ll} \text{maximize} & g(x,s) = \inf_y L(y,x,s) \\ \text{subject to} & \|x\|_1 \leq s \end{array}$$ Substituting, we have $$\begin{array}{ll} \text{maximize} & - \frac{1}{4} x^TW^{-1} x + S_j^T x - \lambda s \\ \text{subject to} & \|x\|_1 \leq s \end{array}$$ Eliminating $s$ would give us the closest result to the one offered. It is not clear to me why the authors chose to express the negative of the dual; the proper dual is necessarily a maximization.