The dual of an optimal problem subject to infinity norm of a matrix

36 Views Asked by At

The optimal problem comes from paper <Model selection through sparse maximum likelihood estimation for multivariate Gaussian or binary data>, section 3.1 is $$ \hat{y}:= \arg\min_y \{ y^T W^{-1} y: ||y - s ||_{\infty} \leq \lambda \tag{1} $$ where $y, s \in \mathbb{R}^{n \times 1}$, $W \in \mathbb{R}^{n \times n}$, $\lambda$ is a scalar. $||A||_{\infty}$ denote the maximum abosolute value element in $A$.

Then, section 3.3 given the dual of the above problem is $$ \min_x x^T W x - s^T x + \lambda ||x||_1 $$ and withou any computational details. Can someone explaine how this result obatined? or I have done some calculation as follows

  1. The new objective function of the optimal problem by Lagrangian method is $$ \begin{align*} \Lambda (y, x) &= y^T W^{-1} y + \sum_i x_i (|y_i - s_i| - \lambda) \\ &= y^T W^{-1} y + x^T |y-s| - \lambda ||x||_1 \end{align*} $$

  2. Let $\hat{y}$ is obtained by $\frac{\partial \Lambda(y, x)}{\partial y} = 0$. $$ \begin{align*} \frac{\partial \Lambda(y, x)}{\partial y} = 2W^{-1}y + \Gamma(y-s) \circ x \end{align*} $$ $A \circ B$ denote Hadamard product (element-wise multiply), and $$ {\Gamma(y-s)}_i = \gamma_i = \begin{cases} 1 \quad & y_i > s_i \\ -1 \quad & y_i < s_i \\ -1 < \alpha < 1 \quad & y_i = s_i \end{cases} $$ Then $\hat{y} = -\frac{1}{2} W \cdot (\gamma \circ x)$

  3. The dual of (1) is $$ \begin{align*} \max_x \min_y \Lambda(y, x) &= \max_x \min_y y^T W^{-1} y + x^T |y-s| - \lambda ||x||_1 \\ &= \max_x \hat{y}^T W^{-1} \hat{y} + x^T |\hat{y}-s| - \lambda ||x||_1 \\ &= \frac{1}{4} (\gamma \circ x)^T W (\gamma \circ x) + \frac{1}{2} x^T | W (\gamma \circ x) - s| - \lambda ||x||_1 \\ \text{(it should be)} &= \min_x x^T W x - s^T x + \lambda ||x||_1 \end{align*} $$

However I cannot get the equivalence of the result of paper and my calculation, what mistakes I have made and how to get the right results? Thanks advance.