What is the dual problem of a typical $L_1$ norm?


I was reading the solutions manual for Prof. Boyd's optimisation book. In the solution set, for exercise 6.4 it is stated that the dual of the following problem:

$$\min_x \; {||Ax-b||_1} $$

is:

\begin{gather*} \max {\sum_{i=1}^m{b_i\lambda_i}} \\ \text{such that} \quad |\lambda_i| \le1, i=1, \ldots, m \\ \sum_{i=1}^m\lambda_i a_i = 0 \end{gather*}

The point is that I've tried to derive the second problem from the first by employing the conjugate. I've verified the first and last lines of the second problem, but I really cannot figure out the second line. I guess it may be connected to the conjugate (dual) norm, but I don't see how, and I'm not sure. Does anyone have any ideas?

It's worth mentioning that the $\ell^1$ norm can easily be handled by introducing two sets of dual variables, but for this specific problem a different approach has been used, via the norm conjugate I believe.


You can derive the dual via linear programming. Rewrite the primal problem as the following linear program (after the change of variables $x \mapsto -x$, which leaves the optimal value unchanged since $x$ is free), with dual variables in parentheses: \begin{align} &\text{minimize} &\sum_i z_i \\ &\text{subject to} & z_i - \sum_j A_{i,j} x_j &\ge b_i &&(\alpha_i \ge 0)\\ & & z_i + \sum_j A_{i,j} x_j &\ge -b_i &&(\beta_i \ge 0)\\ & & x_j &\quad\text{ free} \\ & & z_i &\ge 0 \end{align}
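As a sanity check, this linear program can be solved numerically, e.g. with SciPy's `linprog` (a sketch assuming SciPy is available; the matrix `A`, vector `b`, and the dimensions below are arbitrary made-up test data, not from the book):

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)
m, n = 8, 3                      # arbitrary test dimensions
A = rng.standard_normal((m, n))  # arbitrary test data
b = rng.standard_normal(m)

# Stacked variable v = [x (n entries, free); z (m entries, >= 0)],
# objective sum(z).
c = np.concatenate([np.zeros(n), np.ones(m)])

# z - Ax >= b  and  z + Ax >= -b, rewritten as A_ub @ v <= b_ub:
#   Ax - z <= -b   and   -Ax - z <= b
A_ub = np.block([[A, -np.eye(m)], [-A, -np.eye(m)]])
b_ub = np.concatenate([-b, b])
bounds = [(None, None)] * n + [(0, None)] * m  # x free, z >= 0

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
x, z = res.x[:n], res.x[n:]

# At the optimum each z_i is tight: z_i = |(Ax)_i + b_i|, so the LP value
# is min_x ||Ax + b||_1, which equals min_x ||Ax - b||_1 since x is free.
print(res.fun, np.abs(A @ x + b).sum())
```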

The dual problem is then: \begin{align} &\text{maximize} &\sum_i b_i \alpha_i - \sum_i b_i \beta_i \\ &\text{subject to} & -\sum_i A_{i,j} \alpha_i + \sum_i A_{i,j} \beta_i &= 0 &&(\text{$x_j$ free})\\ & & \alpha_i + \beta_i &\le 1 &&(z_i \ge 0)\\ & & \alpha_i &\ge 0 \\ & & \beta_i &\ge 0 \end{align}

Now let $\lambda_i = \alpha_i - \beta_i$. Because $\alpha_i \ge 0$ and $\beta_i \ge 0$, we have $|\lambda_i| \le \alpha_i + \beta_i \le 1$. Conversely, any $\lambda_i$ with $|\lambda_i| \le 1$ is realised by the feasible choice $\alpha_i = \max(\lambda_i, 0)$, $\beta_i = \max(-\lambda_i, 0)$, and the objective depends on $(\alpha_i, \beta_i)$ only through $\lambda_i$, since $\sum_i b_i \alpha_i - \sum_i b_i \beta_i = \sum_i b_i \lambda_i$.

So the dual problem becomes: \begin{align} &\text{maximize} &\sum_i b_i \lambda_i \\ &\text{subject to} & \sum_i A_{i,j} \lambda_i &= 0 &&(\text{$x_j$ free})\\ & & |\lambda_i| &\le 1 &&(z_i \ge 0)\\ & & \lambda_i &\quad \text{ free} \end{align}


Alternatively, start with the dual problem as the linear program: \begin{align} &\text{maximize} &\sum_i b_i \lambda_i \\ &\text{subject to} & \sum_i A_{i,j} \lambda_i &= 0 &&(\text{$x_j$ free})\\ & & -\lambda_i &\le 1 &&(u_i \ge 0)\\ & & \lambda_i &\le 1 &&(v_i \ge 0)\\ & & \lambda_i &\quad \text{ free} \end{align}

The dual of the dual is: \begin{align} &\text{minimize} &\sum_i (u_i+v_i) \\ &\text{subject to} & \sum_j A_{i,j} x_j -u_i+v_i &= b_i &&(\text{$\lambda_i$ free})\\ & & x_j &\quad \text{ free} \\ & & u_i &\ge 0 \\ & & v_i &\ge 0 \end{align}

And this is the other well-known LP formulation for minimizing $||Ax-b||_1$.
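Strong LP duality between the dual problem and this $(u, v)$ formulation can be checked numerically with SciPy's `linprog` (a sketch with arbitrary made-up test data, assuming SciPy is available):

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(1)
m, n = 8, 3                      # arbitrary test dimensions
A = rng.standard_normal((m, n))  # arbitrary test data
b = rng.standard_normal(m)

# Dual LP: max b @ lam  s.t.  A.T @ lam = 0,  -1 <= lam_i <= 1
# (linprog minimizes, so negate the objective).
dual = linprog(-b, A_eq=A.T, b_eq=np.zeros(n),
               bounds=[(-1, 1)] * m, method="highs")

# (u, v) formulation: min sum(u + v)  s.t.  Ax - u + v = b,  u, v >= 0,
# with stacked variable [x; u; v].
c = np.concatenate([np.zeros(n), np.ones(2 * m)])
A_eq = np.hstack([A, -np.eye(m), np.eye(m)])
bounds = [(None, None)] * n + [(0, None)] * (2 * m)
primal = linprog(c, A_eq=A_eq, b_eq=b, bounds=bounds, method="highs")

print(-dual.fun, primal.fun)  # equal by strong LP duality
```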


Here's a solution that uses a Lagrangian-based approach.

Starting from the original problem: \begin{align} &\text{minimize}& ||Ax-b||_1 \end{align}

We can reformulate this as a constrained optimization problem: \begin{align} &\text{minimize} & ||y||_1 \\ &\text{subject to} & Ax-b =y \end{align}

We can then form this problem's Lagrangian: $$L(x, y, \lambda)= ||y||_1 + \lambda^\top (y - Ax + b)$$

The Lagrange dual function is:

$$g(\lambda) = \inf_{x, y} L(x, y, \lambda)$$

The dual problem is to maximize $g(\lambda)$. Since any $\lambda$ with $g(\lambda) = -\infty$ contributes nothing to the maximization, we may restrict attention to $\lambda$ for which $g(\lambda) > -\infty$.

Note that if $\lambda^T A \neq 0$ then it is easy to see that $g(\lambda) = -\infty$ (set $y = 0$ and $x = cA^\top \lambda$ for $c \in \mathbb{R}$, and let $c \to \infty$).

We claim if any $|\lambda_i| > 1$ then $g(\lambda) = -\infty$. To see this, set $y = ce_i$ for $c \in \mathbb{R}$ and $x = 0$. Then $L(x, y, \lambda) = |c| + \lambda_i c + \lambda^\top b$. If $\lambda_i > 1$, then $L(x, y, \lambda) \to -\infty$ as $c \to -\infty$ and if $\lambda_i < -1$ then $L(x, y, \lambda) \to -\infty$ as $c \to \infty$.
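This unboundedness argument is easy to illustrate numerically: evaluate $L(0, c e_i, \lambda)$ for a $\lambda$ with $\lambda_i > 1$ and increasingly negative $c$ (a small sketch with made-up data, not from the original answer):

```python
import numpy as np

b = np.array([1.0, -2.0, 0.5])    # made-up data
lam = np.array([1.5, 0.0, 0.0])   # lambda_1 = 1.5 > 1 violates |lambda_i| <= 1
i = 0

def L_at(c):
    # L(x=0, y=c*e_i, lam) = ||y||_1 + lam^T (y + b) = |c| + lam_i*c + lam^T b
    y = np.zeros_like(b)
    y[i] = c
    return np.abs(y).sum() + lam @ (y + b)

# As c -> -infinity the Lagrangian is dominated by c*(lam_i - 1) -> -infinity.
print([L_at(c) for c in (-1.0, -10.0, -100.0)])  # [1.0, -3.5, -48.5]
```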

Thus, the dual problem is:

\begin{align} &\text{maximize} & g(\lambda) \\ &\text{subject to} & \lambda^\top A = 0 \\ & & |\lambda_i| \leq 1 \text{ for } i = 1, ..., m \end{align}

Finally, we claim $g(\lambda) = \lambda^\top b$ when $\lambda$ lies in the constraint set above. It is obvious that if $\lambda^\top A = 0$ then: $$L(x, y, \lambda) = ||y||_1 + \lambda^\top y + \lambda^\top b = \left[\sum_{i=1}^m |y_i| + \lambda_i y_i\right] + \lambda^\top b$$

For $i \in \{1, ..., m\}$, we have: \begin{align*} |\lambda_i| \leq 1 &\implies -|\lambda_i| \geq -1 \\ &\implies -|\lambda_i||y_i| \geq -|y_i| \\ &\implies |y_i|-|\lambda_i||y_i| \geq 0 \end{align*} and since $\lambda_i y_i \geq -|\lambda_i||y_i|$, this gives $|y_i| + \lambda_i y_i \geq |y_i| - |\lambda_i||y_i| \geq 0$.

Thus $L(x, y, \lambda) \geq \lambda^\top b$ for all $x, y$ given that $\lambda$ lies in the constraint set. But this lower bound is achieved by setting $x = 0$ and $y= 0$. Hence, $g(\lambda) = \lambda^\top b$ for $\lambda$ in the constraint set.

The final dual problem is: \begin{align} &\text{maximize} & \lambda^\top b \\ &\text{subject to} & \lambda^\top A = 0 \\ & & |\lambda_i| \leq 1 \text{ for } i = 1, ..., m \end{align}
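Weak duality ($g(\lambda) = \lambda^\top b \le \|Ax - b\|_1$ for every feasible $\lambda$ and every $x$) can be checked with plain NumPy by sampling $\lambda$ from the constraint set via the null space of $A^\top$ (a sketch with made-up random data, not part of the original answer):

```python
import numpy as np

rng = np.random.default_rng(2)
m, n = 8, 3                      # arbitrary test dimensions
A = rng.standard_normal((m, n))  # arbitrary test data
b = rng.standard_normal(m)

# Null space of A.T: rows of Vt whose singular values are (numerically) zero.
_, s, Vt = np.linalg.svd(A.T)
rank = int((s > 1e-10).sum())
N = Vt[rank:]                    # each row satisfies A.T @ row ~ 0

violations = 0
for _ in range(100):
    lam = N.T @ rng.standard_normal(N.shape[0])  # lam in null(A.T)
    lam /= max(np.abs(lam).max(), 1e-12)         # scale so |lam_i| <= 1
    x = rng.standard_normal(n)
    # Weak duality: dual objective never exceeds any primal objective value.
    if lam @ b > np.abs(A @ x - b).sum() + 1e-8:
        violations += 1

print(violations)  # 0
```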


Here is the conjugate approach. The problem is:

$$\min_{x,y} \left\{ ||y||_1 : y = Ax-b \right\}$$ The Lagrangian is: $$L(x,y,\lambda) = ||y||_1+\lambda^T(y-Ax+b)$$ so the dual is \begin{align} & \max_{\lambda} \min_{x,y} \left\{ ||y||_1+\lambda^T(y-Ax+b) \right\} \\ = &\max_{\lambda} \left\{ b^T\lambda + \min_x (A^T\lambda)^Tx -\max_y (-\lambda)^Ty - ||y||_1 \right\} \\ = &\max_{\lambda} \left\{ b^T\lambda : A^T\lambda = 0, ||\lambda||_{\infty} \leq 1 \right\} \end{align} In the last step I used the conjugate of the $1$-norm. The function value of the conjugate is 0 if the dual norm (so the $\infty$-norm) of $-\lambda$ is at most $1$.