optimization of the multiplication of two variables

792 Views Asked by At

I have just started learning optimization because I need it in one my projects but I'm totally naive in this field. I was wondering if there is any method to solve a minimization problem of the form:

$$\min_{X,Y}{\left\lVert Z-AXY \right\rVert}_2 ^2 +\alpha \left\lVert X \right\rVert_1 +\beta \alpha \left\lVert Y \right\rVert_1 $$

$Z$ and $A$ are given and they are matrices that I have considered the vectorized format for this optimization problem. I know that if I want to use ADMM I need to change the problem into a summation maybe take logarithm but I'm not sure if that would make any sense?

From somewhere I got the idea of freezing one variable at a time and solve for the other one but I'm not sure if that would work here. Please let me know if it solvable and please mention the methods considering the convexity and non-convexity of the objective function (I want to know both).

Thanks

1

There are 1 best solutions below

0
On

Let's define two additional variables: $C=X$ and $D=Y$

then we reformulate the original problem as follows

$$\min_{X,Y,C,D} \|Z-AXY\|_2^2 + \alpha\|C\|_1 + \alpha\beta\|D\|_1, \qquad \text{s.t.} C-X=0, \, D-Y=0$$

The augmented Lagrangian multiplier function is

$$\min_{X,Y,C,D} \|Z-AXY\|_2^2 + \alpha\|C\|_1 + \alpha\beta\|D\|_1 + \langle\ \mu_1,C-X\rangle + \frac{\rho_1}{2}\|C-X\|^2_2 + \langle\ \mu_2,D-Y\rangle + \frac{\rho_2}{2}\|D-Y\|^2_2$$

where $\mu_1, \mu_2$ are Lagrang2 multipliers and $\rho_1, \rho_2$ are penalty parameters.

We can now define the ADMM iterations as follows

$$\begin{align*} X^+ &= arg\min_{X} \|Z-AXY\|_2^2 + \langle\ \mu_1,C-X\rangle + \frac{\rho_1}{2}\|C-X\|^2_2 & \qquad (1)\\ Y^+ &= arg\min_{Y} \|Z-AXY\|_2^2 + \langle\ \mu_2,D-Y\rangle + \frac{\rho_2}{2}\|D-Y\|^2_2 & \qquad (2)\\ C^+ &= arg\min_{C} \alpha\|C\|_1 + \frac{\rho_1}{2}\|C-(X-\frac{\mu_1}{\rho_1})\|^2_2 & \qquad (3)\\ D^+ &= arg\min_{D} \alpha\beta\|D\|_1 + \frac{\rho_2}{2}\|D-(Y-\frac{\mu_2}{\rho_2})\|^2_2 & \qquad (4)\\ \mu_1^+ &= \mu_1 + \rho_1(C-X) & \qquad (5)\\ \mu_2^+ &= \mu_2 + \rho_2(D-Y) & \qquad (6) \end{align*}$$

For $X$ and $Y$ variables, we take the derivative of (1) and (2) w.r.t X, Y respectively and then we solve these expressions to get the update formulas.

For $C$ and $D$ variables, we use the proximal operator of the $L_1-$norm to obtain a closed-form solution for (3) and (4), which is given by:

$$C^+=sign(Y-\frac{\mu_2}{\rho_2})\odot \max(\frac{\alpha}{\rho_1}(Y-\frac{\mu_2}{\rho_2}), 0), \qquad \text{(1-dimentional shrinkage operator)}$$

The same thing for the $D$ variable.