I'm trying to optimize the following equation:
$$min_y \frac{1}{2}||y-z||^2_2 \\ s.t \ (\mathbf{y} - \mathbf{\sigma})^T\mathbf{A}(\mathbf{y}-\mathbf{\sigma}) \leq 1$$
Note: $A ⪰ 0$
I've started out by writing the Lagrangian
$ L(y, \lambda)= (y-z) + \lambda ( (y-\sigma)^T A (y-\sigma) - 1)$
Now to get the partial derivative, I'm not sure how to proceed for the second term; can anyone help me out here? Is there a good
$\frac{\partial L}{\partial y} = (y - z) \ + \ 2\lambda A(y-\sigma) = 0 $
First question: How do you simplify the expression above to get it in form of y = ... where I'm not sure how to work with matrices (i.e. in the case of A how do I move it to the other side? Since you can't really divide with matrices here).
I believe the second step for the KKT is to check whether $ \lambda .((y - \sigma)^TA(y-\sigma) - 1) = 0$
In the case $\lambda \neq 0$ we have $ (y - \sigma)^TA(y-\sigma) = 1$ which I'm guessing I replace whatever variable I get in the above and then run an iterative algorithm (any hint on what might work quickly would be appreciated)
In addition to the above, can someone please be kind enough to provide a high level intuition of what's going on and how the optimization procedure is working? I'm at a complete loss.
Let $L = |y - z|^2/2 + \mu g(y)$, where $g(y) = (y - \sigma)^T A (y - \sigma) - 1 \le 0$.
Then, KKT requires $\partial L/\partial y = (y - z) + 2\mu A (y - \sigma) = 0$ with $\mu g(y) = 0$ and $\mu \ge 0$.
First, rearranging the derivative yields $(I + 2 \mu A)y = z + 2 \mu A \sigma$. Since $A$ is positive semidefinite and $\mu \ge 0$, $I + 2\mu A$ is positive definite; hence invertible. So we obtain $$ y = (I + 2 \mu A)^{-1}(z + 2 \mu A \sigma) \quad\cdots\quad (1) $$
Now, we see the two cases: $g(y) = 0$ and $g(y) < 0$.
If $g(y) < 0$, $\mu$ must be zero to meet the condition, which means $y = z$ from (1). The minimum, in this case, is $0$. We can see from this result that the point $z$ must satisfy $g(z) < 0$ for the problem to make sense. Otherwise, the problem doesn't have a feasible solution, or it is not the case that $g(y) < 0$.
If $g(y) = 0$ (we say that the constraint is active in this case), $\mu \ge 0$ has to be determined from the equation obtained by substituting (1) to $g(y) = 0$, and this amounts to solving an optimization problem with an equality constraint. Geometrically, it is equivalent to finding the radius of the smallest closed ball centered at $z$ that touches $g(y) = 0$. I guess this should be sought by using a numerical approach for general $A$, $\sigma$, and $z$.
Thank you.