How to prove the optimality of $r = \frac{Ax^*-b}{\|Ax^*-b\|_2}$ for the dual problem of $\text{minimize}\quad \|Ax-b\|_2 + \gamma\|x\|_1$?

139 Views Asked by At

When I'm doing convex optimization homework, the teacher wants me to show the following problem:(I believe this should be done using duality)

Consider the optimization problem: $$\text{minimize}\quad \|Ax-b\|_2 + \gamma\|x\|_1$$ where $A\in\mathbb{R}^{m\times n}$,$b\in \mathbb{R}^m$ and $\gamma>0$. If $x^*$ is an optimal point, and $Ax^*-b\neq0$, define : $r = \frac{Ax^*-b}{\|Ax^*-b\|_2}$. $$\text{Show that:}\quad\|A^Tr\|_\infty\leq \gamma \quad r^TAx^*+\gamma\|x^*\|_1=0$$

However, I don't know how the inequality and equality I need to show are related to the optimization problem... I tried to derive the dual problem of this optimization problem:

First I change it into: $$\text{minimize}\quad \|y\|_2 + \gamma\|x\|_1$$ $$\text{subject to}\quad Ax-b=y$$

And I derive it's dual problem: $$\text{maxmize}\quad -v^Tb$$ $$\text{subject to}\quad \|A^Tv\|_\infty\leq \gamma\quad \|v\|_2\leq 1$$

I can see that, if $r$ is the optimal point of the dual problem, then by the strong duality, we can show that:$-r^Tb=\|Ax^*-b\|_2+\gamma\|x^*\|_1$,which is just $r^TAx^*+\gamma\|x^*\|_1=0$. But I don't know how to show that $r$ is the optimal point of the dual problem...Any hints on how can I prove this? Thanks!

1

There are 1 best solutions below

0
On

I found some free time to try and solve this. Let $e=(1,\ldots,1)^T$ and rewrite the problem in the following equivalent SOCP form: \begin{align} \text{minimize}\quad& y + \gamma e^Tz,\\ \text{subject to}\quad& \|Ax-b\|_2^2/2\leq y^2/2,\\ & x-z\leq 0,\\ & -x-z \leq 0, \end{align} where $a\leq b \iff b-a\in\mathbb{R}^n_+$. Note that Slater's condition holds for this problem: take $x = 0, y = \alpha \|b\|, z = \beta e, \alpha > 1, \beta > 0,$ thus strong duality holds. It is also clearly bounded, so it is dual-feasible.

Write the Lagrangian as: $$ L(x,y,z,\lambda,\mu,\nu)=y + \gamma e^Tz + \lambda\left(\|Ax-b\|_2^2 - y^2\right)/2+\sum_{i=1}^n\mu_i(x_i-z_i)-\sum_{i=1}^n\nu_i(x_i+z_i), $$ and write down the KKT conditions that our optimal point $(x,y,z)$ satisfies: \begin{align} \nabla_xL =\lambda A^T(Ax-b)+\mu-\nu &=0\\ \nabla_yL =1-\lambda y&=0\\ \nabla_zL =\gamma e-\mu-\nu&=0\\ \lambda\left(\|Ax-b\|_2^2 - y^2\right)/2+\sum_{i=1}^n\mu_i(x_i-z_i)-\sum_{i=1}^n\nu_i(x_i+z_i)&=0\\ \begin{pmatrix}I&-I\\-I&-I\end{pmatrix}\begin{pmatrix}x\\z\end{pmatrix}&\leq 0\\ \|Ax-b\|_2^2&\leq y^2\\ \lambda,\mu,\nu &\geq 0. \end{align} Notice that from the second equality, $y = \lambda^{-1} > 0$, and so by complementary slackness, we must have $y = \|Ax - b\|_2$ so $\lambda = \|Ax-b\|_2^{-1}$. We can eliminate $\nu$ using the third equality:$\nu =\gamma e - \mu,$ thus from the first equality $$ \lambda A^T(Ax-b) + 2\mu - \gamma e = A^Tr + 2\mu - \gamma e \implies \mu = \frac{1}{2}(\gamma e - A^Tr) \geq 0\implies \gamma e \geq A^Tr, $$ and also $$ \nu = \frac{1}{2}(\gamma e + A^Tr)\geq 0 \implies A^Tr\geq -\gamma e. $$ and these two inequalities are equivalent to $\|A^Tr\|_\infty \leq \gamma$.

Also, the fourth equality from the KKT conditions becomes: $$ (\mu-\nu)^T x - (\mu+\nu)^Tz = -r^TAx - \gamma e^Tz = 0. $$ which is equivalent to $$ r^TAx + \gamma \|x\|_1 = 0, $$ after setting $z_i=\operatorname{sign}(x_i)x_i,i=1,\ldots,n$.