How to convert SVM optimization problem into terms of quadratic programming?


I'm reading "The Elements of Statistical Learning", to be precise chapter about optimal separating hyperplanes (SVM classification problem) and I've stuck with the following. I have a function to be maximized:

$$ \sum_{i=1}^{N}a_i - \frac{1}{2}\sum_{i=1}^{N}\sum_{k=1}^{N}a_ia_ky_iy_kx_i^Tx_k, $$

subject to:

$$ 0 = \sum_{i=1}^{N}y_ia_i, \quad b = \sum_{i=1}^{N}a_iy_ix_i, \quad a_i \ge 0 \quad \forall{i}, \quad a_i[y_i(x_i^Tb + b_0) - 1] = 0 \quad \forall{i} $$

where $y_i$ and $x_i$ are known, while $a_i$, $b$ and $b_0$ are to be found.

As far as I understand, the next step is to solve this problem using some quadratic programming approach. At the same time, a quadratic programming problem looks like:

$$ \min_X \; \frac{1}{2}X^TQX+C^TX $$

subject to:

$$ AX \le B $$

And I don't understand how to convert the first problem into the second. As far as I can see, $a_i \ge 0$ can be transformed into $AX \le B$, but what about $a_i[y_i(x_i^Tb + b_0) - 1] = 0$? It looks like a quadratic constraint that cannot be transformed into a linear one.

Accepted answer:

The primal optimization problem has the constraints $$ y_i(x_i^Tb+b_0)-1 \geq 0 \quad \forall\, 1 \leq i \leq N. $$ Using the Karush-Kuhn-Tucker theorem, you formed an equivalent dual problem that can be solved via quadratic programming, and in the dual the condition you are worried about is not a constraint at all. The only constraints on the dual variables are a linear equality and the non-negativity conditions: $$ \sum_{i=1}^N a_i y_i = 0 \quad \text{and} \quad a_i \geq 0 \quad \forall\, 1 \leq i \leq N. $$ The equation $b = \sum_{i=1}^{N} a_i y_i x_i$ is not a constraint either: it simply defines $b$ once the $a_i$ are known. Finally, the complementary slackness condition $a_i[y_i(x_i^Tb+b_0)-1]=0$ holds automatically at the optimum by the KKT theorem; you do not impose it on the QP, you use it afterwards to recover $b_0$ from any point with $a_i > 0$.
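
To make the conversion explicit, here is a sketch of the dual in the standard QP form from the question (maximizing the objective is the same as minimizing its negation):

$$ \min_{a} \; \frac{1}{2}a^TQa - \mathbf{1}^Ta, \qquad Q_{ik} = y_iy_kx_i^Tx_k, $$

subject to

$$ y^Ta = 0, \qquad -Ia \le 0, $$

i.e. $C = -\mathbf{1}$, $A = -I$, $B = 0$, plus the single linear equality $y^Ta = 0$ (most QP solvers accept equalities directly; otherwise it can be written as the pair $y^Ta \le 0$, $-y^Ta \le 0$). After solving for $a$, set $b = \sum_{i=1}^{N} a_iy_ix_i$ and $b_0 = y_j - x_j^Tb$ for any $j$ with $a_j > 0$, which is exactly where complementary slackness enters.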
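
For illustration, here is a minimal sketch of that pipeline in Python, assuming the `cvxopt` package is available; the function name `svm_dual_qp` and the support-vector tolerance `1e-6` are my own choices, not from the book:

```python
import numpy as np
from cvxopt import matrix, solvers

def svm_dual_qp(X, y):
    """Hard-margin SVM dual as a QP (assumes linearly separable data).

    X: (N, p) inputs; y: (N,) labels in {-1, +1}.
    Returns the dual variables a, the weight vector b, and the offset b_0.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    N = X.shape[0]
    Q = matrix(np.outer(y, y) * (X @ X.T))  # Q_ik = y_i y_k x_i^T x_k
    c = matrix(-np.ones(N))                 # C = -1: max sum(a) = min -sum(a)
    G = matrix(-np.eye(N))                  # -a_i <= 0 encodes a_i >= 0
    h = matrix(np.zeros(N))
    A = matrix(y.reshape(1, -1))            # equality constraint y^T a = 0
    sol = solvers.qp(Q, c, G, h, A, matrix(0.0))
    a = np.ravel(sol["x"])
    b = (a * y) @ X                         # b = sum_i a_i y_i x_i
    sv = a > 1e-6                           # support vectors: a_i > 0
    b0 = np.mean(y[sv] - X[sv] @ b)         # from y_i (x_i^T b + b_0) = 1
    return a, b, b0
```

In practice one often adds a small ridge to the diagonal of $Q$ for numerical stability, or switches to the soft-margin variant by appending the upper bounds $a_i \le C_{\text{slack}}$ as extra rows of $G$ and $h$.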