Solving constraint regression problem using Lagrangian

118 Views Asked by At

As part of an exam some weeks ago, I had the following problem:

Find a minimizer for:

$ \min_{x \in \mathbb{R}^D} || Ax - y||_2 \text{ subject to } || x ||_2 \leq t, $

where $ t > 0, A \in \mathbb{R}^{k \times d} $. Use the Lagrangian $ L(x, \nu) = || Ax - y||_2^2 + \nu (||x||_2^2 - t^2) $

I took the gradient of the Lagrangian function w.r.t $ x $ and set it to $ 0 $, like this:

$$ \nabla_x L(x, \nu) = \nabla_x [ (Ax-y)^T (Ax-y) + \nu (x^T x - t^2)] = \nabla_x [x^T A^T A x - 2 x^T A^T y + y^T y - \nu (x^T x - t^2)] = 2 A^T A x - 2 A^T y - 2 \nu x \overset{!}{=} 0 \iff x = (A^T A + \nu I)^{-1} A^T y, $$ where $ I $ is the identity matrix and we assumed that the inverse exists.

Following the 'classical' steps for solving constraint optimization problems, the next step would have been to take this $ x $ and replace it in the Lagrangian in order to obtain the dual, then take the derivative of the dual with respect to $ \nu $ and set it to $ 0 $. I was short on time and after noticing that $ \nu $ appears under the inverse and it wasn't straightforward to compute its derivative, I skipped the rest of the exercise.

Now, solutions are out and what we got is the following: set the gradient of the Lagrangian w.r.t $ x $ to zero and obtain $ x(\nu) $ just like I did above. Then set the derivative of the Lagrangian w.r.t $ \nu $ to zero, i.e.:

$$ (*) \frac{\partial L(x, \nu)}{\partial \nu} = || x ||_2^2 - t^2 \overset{!}{=} 0 \iff || x ||_2^2 = t^2 $$ and finally the last step was "choose $ \nu $ such that $ || x ||_2 = || (A^T A + \nu I)^{-1} A^T y ||_2 = t $ "

I would say that $ (*) $ is incorrect, but I am not completely sure. Why would it make sense to set the derivative of the Lagrangian w.r.t the multiplier to zero ? If we look at the KKT conditions, from complementary slackness we know that $ \nu (|| x||_2^2 - t^2) = 0 $, but this means that we could have $ \nu = 0 $ and $ ||x ||_2 < t $ and not necessarily $ || x ||_2 = t $

Is the provided solution a correct one? If yes, why?