In Banerjee et al. (2008) the authors derived from (eq 4) : $$\operatorname{min}_{y}\{y^TWy : \|y-S_j\|_{\infty}\leq \lambda \} $$ the dual problem (eq 5) $$ \operatorname{min}_x x^TW^{-1}x -S_j^Tx+\lambda\|x\|_1$$ where $S_j$ is a vector, $W\succ 0$. I can't find this dual problem from the primal. I know that the dual of $\ell_{\infty}$ norm is $\ell_1$ but it didn't help me much. Any idea ? Thank you.
Dual problem of minimizing a quadratic objective function with an infinity norm inequality constraint
1.8k Views Asked by Bumbble Comm https://math.techqa.club/user/bumbble-comm/detail AtThere are 3 best solutions below
On
Here's my attempt in finding the dual problem. You will notice that its formulation is slightly different from the one stated in the paper. This means that I might be doing something wrong in the derivation, however, I thought I should post this answer just to show one approach for finding the dual.
Expressing the primal problem as (with $S_j$ written as $s$ for clarity)
$$ \begin{align} \text{minimize } & y^TWy\\ \text{subject to } & y-s-\lambda1\leq 0\\ & -y+s-\lambda1 \leq 0 \end{align} $$ where $1$ is the all-ones vector, the dual problem can be written as
$$ \begin{align} \text{maximize } & \inf_y \mathcal{L}(y,\mu_1,\mu_2)\\\ \text{subject to } & \mu_1\geq 0\\ & \mu_2 \geq 0 \end{align} $$
with optimization variables $\mu_1$, $\mu_2$, where the Lagrangian $\mathcal{L}$ equals
$$ \mathcal{L} \triangleq y^TWy + \mu_1^T(y-s-\lambda1)+\mu_2^T(-y+s-\lambda1). $$
By standard differential calculus, it follows that $\mathcal{L}$ is minimized w.r.t. $y$, for $y=y^*=\frac{1}{2}W^{-1}(\mu_2-\mu_1)$, which leads to the explicit dual problem formulation
$$ \begin{align} \text{maximize } & -\frac{1}{4}(\mu_2-\mu_1)^TW^{-1}(\mu_2-\mu_1)+s^T(\mu_2-\mu_1) -\lambda 1^T(\mu_2-\mu_1) - 2\lambda 1^T \mu_1\\ \text{subject to } & \mu_1\geq 0\\ & \mu_2 \geq 0 \end{align} $$
With the optimization variable transformation $\mu_2 = x + \mu_1$, the above can be written as (writing the objective as a minimization problem)
$$ \begin{align} \text{minimize } & \frac{1}{4}x^TW^{-1}x-s^Tx +\lambda \|x\|_1 + 2\lambda \| \mu_1\|_1\\ \text{subject to } & \mu_1\geq 0\\ & x+\mu_1 \geq 0 \end{align} $$
Now, notice that $\mu_1$ affects only the last term of the objective function, which is clearly minimized for $\mu_1 = 0$. This, in turn, simplifies the dual problem as
$$ \begin{align} \text{minimize } & \frac{1}{4}x^TW^{-1}x-s^Tx +\lambda \|x\|_1 \\ \text{subject to } & x\geq 0\\ \end{align} $$
On
I personally find it easiest to use conic formulations to get to the dual form. In this case, we have: \begin{array}{ll} \text{minimize} & y^TWy \\ \text{subject to} & (y - S_j, \lambda) \in \mathcal{K}_{\ell_1} \end{array} where $\mathcal{K}_{\ell_1}$ is the epigraph of the $\ell_1$ norm, a cone. The Lagrangian is $$L(y,x,s) = y^TWy - \langle (y - S_j, \lambda), (x, s) \rangle = y^TWy - (y-S_j)^T x - \lambda s$$ where $(x,s)$ lies in the dual cone $\mathcal{K}_{\ell_1}^*=\mathcal{K}_{\ell_\infty}$. The optimality condition on $y$ is $2Wy - x = 0$, or $y = \frac{1}{2} W^{-1} x$. So the dual function is $$g(x,s) = \frac{1}{4} x^TW^{-1} x - \frac{1}{2} x^TW^{-1} x + S_j^T x - \lambda s = - \frac{1}{4} x^TW^{-1} x + S_j^T x - \lambda s$$ The dual problem is $$\begin{array}{ll} \text{maximize} & g(x,s) = \inf_y L(y,x,s) \\ \text{subject to} & \|x\|_1 \leq s \end{array}$$ Substituting, we have $$\begin{array}{ll} \text{maximize} & - \frac{1}{4} x^TW^{-1} x + S_j^T x - \lambda s \\ \text{subject to} & \|x\|_1 \leq s \end{array}$$ Eliminating $s$ would give us the closest result to the one offered. It is not clear to me why the authors chose to express the negative of the dual; the proper dual is necessarily a maximization.
I believe the easiest way is directly using Lagrangian duality. Adding an auxiliary variable $z = y - S_j$, your problem becomes:
$$ \begin{aligned} \text{mininize} &\quad y^T W y \\ \text{subject to} &\quad ||z||_\infty \leq \lambda \\ &\quad z - y + S_j = 0 \end{aligned} $$
Using only the linear constraint for the Lagrangian, we have: $$ \begin{aligned} L(y, z; x) &= y^T W y + x^T(z - y + S_j) \\ &= y^T W y - x^T y + x^T S_j + x^T z \end{aligned} $$
Minimizing the Lagrangian above yields: $$ \begin{aligned} q(x) &= \inf_{y,z} \{ L(y, z; x) : ||z||_\infty \leq \lambda \} \\ &= \inf_y \{ y^T W y - x^T y \} + \inf_z \{ x^T z : ||z||_\infty \leq \lambda \} + x^T S_j \\ &= \inf_y \{ y^T W y - x^T y \} - \sup_z \{ -x^T z : ||z||_\infty \leq \lambda \} + x^T S_j \\ &= \inf_y \{ y^T W y - x^T y \} - \lambda \sup_z \{ (-x)^T (\tfrac{1}{\lambda} z) : ||\tfrac{1}{\lambda}z||_\infty \leq 1 \} + x^T S_j \end{aligned} $$ The infimum can be computed by comparing the gradient to zero, since it is a minimization of a convex quadratic function, and it yields $$-\frac{1}{4} x^T W^{-1} x$$ The supermum is precisely the definition of the dual norm, and thus it yields $$||-x||_1 = ||x||_1$$ So the dual problem is: $$ \text{maximize} \quad q(x) := -\frac{1}{4} x^T W^{-1} x - \lambda ||x||_1 + x^T S_j $$ When cast as a minimization problem, it is almost what you wanted, up to the constant $\tfrac{1}{4}$ before the quadratic term. Since in the paper they only want to say something about the structure of the problem , and thus the constant is not really important, I believe that this small technical error does not make any different with respect to their claim in the paper.