Show equivalence of minimization techniques using Lagrangian multiplier.


Define a cost function:

$$L(x,w) = \frac{1}{2} \sum_{i=1}^N\left(y^{(i)} - w^T\phi\left(x^{(i)}\right)\right)^2 + \frac{\lambda}{2}\sum_{j=1}^M w_j^2$$

Show that minimizing $L$ is equivalent to minimizing

$$\hat L(x,w) = \frac{1}{2} \sum_{i=1}^N\left(y^{(i)} - w^T\phi\left(x^{(i)}\right)\right)^2$$

subject to the constraint

$$\sum_{j=1}^M w_j^2 \leq e$$

What is the significance of $e$?


So my poor understanding of Lagrange multipliers leads me to do the following:

Set

$$ g(x, w) = \sum_{j=1}^M w_j^2 - e $$

so

$$\mathcal{L}(x,w,\lambda) = \hat L(x,w) + \lambda\, g(x,w) = \frac{1}{2} \sum_{i=1}^N\left(y^{(i)} - w^T\phi\left(x^{(i)}\right)\right)^2 + \lambda \left(\sum_{j=1}^M w_j^2 - e\right)$$

So $\mathcal{L}$ appears very similar to $L$. I think this is where the $e$ comes into play. If the two functions are to be identical (which I don't think is the point), then

$$e = \sum_{j=1}^M \frac{w_j^2}{2}$$

but looking at the original constraint

$$\sum_{j=1}^M w_j^2 \leq e \implies \sum_{j=1}^M w_j^2 - \sum_{j=1}^M \frac{w_j^2}{2} \leq 0$$

which is clearly violated unless $w = 0$ (assuming $w$ is real-valued).

I just don't see why $e$ is so special. Nor do I see what the purpose of the exercise is. Please help me see what I'm supposed to.

1 Answer


You basically have it. The question asks you to show that imposing a constraint on $\|w\|_2^2$ is equivalent to penalizing $\|w\|_2^2$ in the objective. That means showing that for each $\lambda \geq 0$ there is some $e \geq 0$ such that the two optimization problems return the same optimal $w$; it does not mean that the two functions are identical.
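One way to make this precise (a sketch, writing the constraint's multiplier as $\mu$ to keep it distinct from the penalty weight $\lambda$) is to compare the stationarity conditions of the two problems:

```latex
\begin{align*}
\nabla_w L
  &= -\sum_{i=1}^N \left(y^{(i)} - w^T\phi\!\left(x^{(i)}\right)\right)\phi\!\left(x^{(i)}\right) + \lambda w = 0, \\
\nabla_w \hat L + \mu\,\nabla_w g
  &= -\sum_{i=1}^N \left(y^{(i)} - w^T\phi\!\left(x^{(i)}\right)\right)\phi\!\left(x^{(i)}\right) + 2\mu w = 0.
\end{align*}
```

Taking $\mu = \lambda/2$ makes the two conditions coincide, so the penalized minimizer $w^*$ also satisfies the constrained problem's KKT conditions once we choose $e = \sum_j (w_j^*)^2$: the constraint is then active, so complementary slackness $\mu\left(\sum_j w_j^2 - e\right) = 0$ holds as well. That choice of $e$ is exactly its significance: it is the squared norm of the penalized solution.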

In fact, if we assign a price $\mu$ to the constraint, then the second objective's value is lower by the constant $\mu e$, since putting the constraint into the objective means writing "$+\mu(w^\top w - e)$". A constant offset does not change which $w$ is optimal.
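A quick numerical sanity check of this equivalence (a sketch with synthetic data; the matrix `Phi`, the value of `lam`, and the bisection bounds are arbitrary choices, not from the original problem): solve the penalized (ridge) problem for a fixed $\lambda$, set $e = \|w^*\|_2^2$, then bisect for the multiplier $\mu$ whose constrained solution lands exactly on that boundary. It recovers $\mu = \lambda/2$ and the same $w$.

```python
import numpy as np

rng = np.random.default_rng(0)
N, M = 50, 5
Phi = rng.normal(size=(N, M))                 # rows play the role of phi(x^(i))
y = Phi @ rng.normal(size=M) + 0.1 * rng.normal(size=N)

lam = 2.0
# Penalized problem: grad L = 0 gives w* = (Phi^T Phi + lam I)^{-1} Phi^T y.
w_ridge = np.linalg.solve(Phi.T @ Phi + lam * np.eye(M), Phi.T @ y)
e = w_ridge @ w_ridge                         # choose e = ||w*||_2^2

# Constrained problem: on the active boundary, KKT stationarity gives
# w(mu) = (Phi^T Phi + 2 mu I)^{-1} Phi^T y; ||w(mu)||^2 decreases in mu,
# so bisect on mu >= 0 until ||w(mu)||^2 = e.
def w_of_mu(mu):
    return np.linalg.solve(Phi.T @ Phi + 2 * mu * np.eye(M), Phi.T @ y)

lo, hi = 0.0, 1e6
for _ in range(200):
    mid = 0.5 * (lo + hi)
    if w_of_mu(mid) @ w_of_mu(mid) > e:
        lo = mid                              # norm too big: raise the price
    else:
        hi = mid
mu = 0.5 * (lo + hi)

print(f"mu = {mu:.6f}  (lam/2 = {lam / 2})")  # mu converges to lam/2 = 1.0
print("solutions match:", np.allclose(w_of_mu(mu), w_ridge))
```

The bisection finds the unique $\mu$ at which the constraint is tight, which is exactly $\lambda/2$; the factor of $2$ appears only because the original penalty is written as $\frac{\lambda}{2}\sum_j w_j^2$.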