Different formulation for the soft-margin SVM and the question of convexity

75 Views Asked by At

I came upon 2 questions which I wasn't sure how to approach them analytically regarding the soft-margin SVM.

for the general problem which is stated as follows:

$$\min_{W, b, \zeta}\cfrac{1}{2}\|W\|^2+C \sum_{j=1}^{N}\zeta_{j} \text{ s.t. } y_{j}(Wx_{j}+b) \geq1-\zeta_{j} \text{ & } \zeta_{j}\geq0 \tag{*}$$

The first formulation is:

$$\min_{W, b, \eta}\cfrac{1}{2}\|W\|^2+C \sum_{j=1}^{N}\eta_{j}^{2} \text{ s.t. } y_{j}(Wx_{j}+b) \geq1-\eta_{j}^{2} \tag{**}$$

The second formulation is:

$$\min_{W, b, \psi}\cfrac{1}{2}\|W\|^2+C \sum_{j=1}^{N}\psi_{j}^{2} \text{ s.t. } y_{j}(Wx_{j}+b) \geq1-\psi_{j} \text{ & } \psi_{j}\geq0 \tag{***}$$

Now for the first formulation, the solution can be solved by standard optimization and we can obtain the same result of the general problem stated (*) with the weights and biases the same and $\zeta_{j}=\eta_{j}^{2}$ the problem is that (**) is not convex and one cannot apply the KKT condition.

My question regarding the first formulation is why isn't it a convex problem, and how could I see it analytically.

For the second formulation (***) it was stated the the problem is over constrained with respect to the general problem (*), and it cannot get lower target function. the solution can be solved by standard optimization and we can obtain the same result of the general problem stated (*) with the weights and biases the same and $\zeta_{j}=|\psi_{j}|$

My question regarding the second formulation is why is it true and how could I see it.

Note

I used optimization solver in Matlab to verify the above behaviour but could not derive it alone.

Appreciate some help, Thanks