Looking to add L1 regularization to a quadratic minimization


I want to minimize $\| y - Ax \|^2$ with a total-variation penalty on $x$ (an $\ell_1$ penalty on its discrete derivative $Dx$), where $y$ is given. I plan to use CVXOPT. I think I can set this up as follows:

$$ P = \min_x \frac{1}{2} \|y - Ax \|^2 + \epsilon \|D x \|_1 $$ or, introducing $z = Dx$ with a multiplier $\mu$, $$ P = \max_\mu \min_{x,z} \frac{1}{2} \|y - Ax \|^2 + \epsilon \|z\|_1 + \mu^t(Dx - z) $$ which I break up as the max of $P_1(\mu) + P_2(\mu)$ with $$ P_1 (\mu) = \min_x \frac{1}{2} \|y - Ax \|^2 + \mu^t Dx \qquad P_2(\mu) = \min_z \epsilon \|z\|_1 - \mu^t z $$ The latter is $-\infty$ unless $\| \mu \|_\infty \le \epsilon$, in which case it is zero, but $P_1$ is slowing me down....
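As a quick numeric sanity check on that last claim (plain NumPy, values made up for illustration): each component of $P_2$ is $\min_{z_i} \epsilon |z_i| - \mu_i z_i$, which is $0$ when $|\mu_i| \le \epsilon$ and unbounded below otherwise.

```python
import numpy as np

eps = 1.0
z = np.linspace(-100.0, 100.0, 200001)  # grid of candidate z values, including 0

def p2_term(mu_i):
    # one component of P_2: min over z of eps*|z| - mu_i*z, scanned on the grid
    return np.min(eps * np.abs(z) - mu_i * z)

print(p2_term(0.5))   # |mu_i| <= eps: minimum is ~0, attained at z = 0
print(p2_term(1.5))   # |mu_i| >  eps: decreases without bound (clipped by the grid)
```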

Is the following logic correct?

If $x_0 \in \ker A$, then moving along $x_0$ leaves the quadratic term unchanged, so I could make $\mu^t D x$ as negative as I wanted unless $D^t \mu$ is orthogonal to $\ker A$, i.e. $D^t \mu$ lies in the range of $A^t$: $D^t \mu = A^t b$ for some $b$. If that constraint is satisfied then $$ P_1 ( \mu ) = \min_x \frac{1}{2} (y^t - x^t A^t)(y - A x) + b^t A x $$ Setting the gradient to zero gives $- A^t y + A^t A x + A^t b = 0$, i.e. $A^t(Ax - y + b) = 0$. Assuming $A A^t$ is invertible, $A^t$ has trivial kernel, so this forces $Ax = y - b$. The minimum then becomes $$ P_1 ( \mu ) = \frac{1}{2} y^t y - y^t (y-b) + \frac{1}{2} (y^t - b^t)(y-b) + b^t (y-b) = b^t y - \frac{1}{2} \| b \|^2 $$
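Here is a small NumPy check of that algebra (the sizes and the random $A$, $y$, $b$ are made up): picking any $x$ with $Ax = y - b$ and substituting $\mu^t D x = b^t A x$, the inner objective should come out to $b^t y - \frac{1}{2}\|b\|^2$.

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 5, 9                            # made-up sizes; random wide A has full row rank w.h.p.
A = rng.standard_normal((m, n))
y = rng.standard_normal(m)
b = rng.standard_normal(m)

x = A.T @ np.linalg.solve(A @ A.T, y - b)    # a solution of A x = y - b

# inner objective, with mu^t D x replaced by b^t A x (valid when D^t mu = A^t b)
val = 0.5 * np.linalg.norm(y - A @ x) ** 2 + b @ (A @ x)
print(np.isclose(val, b @ y - 0.5 * b @ b))  # True
```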

Lastly, since $D$ is the first-difference operator, $D D^t$ is invertible, so $\mu = (D D^t)^{-1} D A^t b$ (this solves $D^t \mu = A^t b$ exactly provided $A^t b$ lies in the range of $D^t$, i.e. its entries sum to zero), giving what I think is the final configuration:
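This identity is easy to test in NumPy ($n = 8$ is arbitrary): for any $v$ already in the range of $D^t$, the formula $(D D^t)^{-1} D v$ recovers the unique $\mu$ with $D^t \mu = v$.

```python
import numpy as np

n = 8
D = np.diff(np.eye(n), axis=0)   # (n-1) x n first-difference operator
DDt = D @ D.T                    # tridiagonal and invertible

mu0 = np.random.default_rng(2).standard_normal(n - 1)
v = D.T @ mu0                    # some vector in range(D^t); its entries sum to 0
mu = np.linalg.solve(DDt, D @ v) # mu = (D D^t)^{-1} D v
print(np.allclose(mu, mu0))      # True: the formula recovers mu exactly
```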

$$ P = \max_b\; b^t y - \frac{1}{2} \| b \|^2 \quad \text{where} \quad \| (D D^t)^{-1} D A^t b \|_\infty \le \epsilon $$ which looks like something I can pass to CVXOPT. Does this make sense? That gets me $b$, and so $\mu$. How do I get from $b$ to $x$?

p.s. I see something in the CVXOPT documentation about supplying a customized KKT solver to speed things up, but I'm not there yet. I want to get the "basic" configuration working first.

1 Answer

So, as noted in my comment, the best way to proceed is to add another unknown $z$ of dimension $n-1$ and then minimize $\frac{1}{2}\|y - Ax\|^2 + \epsilon\, \mathbf{1}^t z$ subject to the component-wise constraint $-z \le Dx \le z$.

This may look wildly inefficient, but the corresponding KKT matrix has a very simple structure (mostly diagonal and tridiagonal blocks), so it can be factored very quickly.