Geometric interpretation of support vector values in primal space


The linear support vector machine classifier (labels $y_{k} \in \{-1, +1\}$) with a loss function that tolerates misclassification looks like this in primal weight space:

$$\min\limits_{w,b,\xi} J_{P}(w,\xi) = \frac{1}{2}w^{T}w + c\sum\limits_{k=1}^{N}\xi_{k}$$

Subject to conditions: $$\forall_{k\in1...N} \ \ \xi_{k} \geq 0$$ $$\forall_{k\in1...N} \ \ y_{k}(w^{T}x_{k}+b) \geq 1 - \xi_{k}$$

In dual space it becomes:

$$\max\limits_{\alpha} J_{D}(\alpha) = -\frac{1}{2}\sum\limits_{k,l=1}^{N} y_{k}y_{l}x_{k}^{T}x_{l}\alpha_{k}\alpha_{l}+\sum\limits_{k=1}^{N}\alpha_{k} $$

Subject to conditions: $$\sum\limits_{k=1}^{N}\alpha_{k}y_{k} = 0$$ $$\forall_{k=1...N}\ \ 0 \leq \alpha_{k} \leq c$$

My geometric interpretation of these values:

[Figure: sketch of the decision boundary, the two margins, and the slack variables]

So I can say that (correct me if I'm wrong):

$w^{T}x+b = 0$ is the decision boundary line.

$w^{T}x+b = -1$ and $w^{T}x+b = +1$ are the margins for the respective classes.

$\xi_{k}$ (the slack variables) are the distances of the $k$-th data point from the margin of correct classification.

My question is:

Are there geometric interpretations of $c$ and $\alpha$ values which can be visualised on the above pictorial interpretation as well? If so, what are they?

Best answer:

Your first three statements are correct.

Your picture is qualitatively correct, but it assumes that $\|w\| = 1$, which is not true in general. If you replace the distance of $1$ with $\frac{1}{\|w\|}$, the drawing is correct. Regarding the geometry of $c$ and $\alpha$:
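For reference, the $\frac{1}{\|w\|}$ follows from the standard point-to-hyperplane distance formula. Here $x_{0}$ is an arbitrary point, a symbol introduced only for this sketch:

$$\mathrm{dist}(x_{0}) = \frac{|w^{T}x_{0}+b|}{\|w\|}$$

A point on either margin satisfies $w^{T}x_{0}+b = \pm 1$, so its distance to the decision boundary is $\frac{1}{\|w\|}$, and the two margins are $\frac{2}{\|w\|}$ apart. By the same formula, a point with $y_{k}(w^{T}x_{k}+b) = 1 - \xi_{k}$ and $\xi_{k} > 0$ lies at distance $\frac{\xi_{k}}{\|w\|}$ on the wrong side of its own margin.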

$c$ only sets the trade-off between regularization and margin violation. Geometrically, it therefore affects the distance $\frac{2}{\|w\|}$ between the two margins: a smaller $c$ increases that distance because it makes $\|w\|$ smaller.
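The effect of $c$ on the margin width can be checked numerically. The following is only a minimal sketch: the four one-dimensional data points are hypothetical, and `primal_objective` and `fit_by_grid_search` are illustrative helper names, not part of any library. It minimises $J_{P}$ by brute force over a grid of $(w, b)$ values, which is only feasible for toy problems like this one.

```python
def primal_objective(w, b, c, xs, ys):
    """J_P for 1-D inputs, with each slack set to its optimal value
    xi_k = max(0, 1 - y_k * (w * x_k + b))."""
    slack_sum = sum(max(0.0, 1.0 - y * (w * x + b)) for x, y in zip(xs, ys))
    return 0.5 * w * w + c * slack_sum


def fit_by_grid_search(c, xs, ys, step=0.01, w_max=2.0, b_max=1.0):
    """Brute-force minimiser of J_P over a (w, b) grid. Toy data only."""
    n_w = int(round(w_max / step))
    n_b = int(round(b_max / step))
    best_value, best_w, best_b = float("inf"), None, None
    for i in range(1, n_w + 1):  # w > 0 keeps 2/|w| finite
        w = i * step
        for j in range(-n_b, n_b + 1):
            b = j * step
            value = primal_objective(w, b, c, xs, ys)
            if value < best_value:
                best_value, best_w, best_b = value, w, b
    return best_w, best_b


# Hypothetical, linearly separable toy data on the real line.
xs = [-2.0, -1.0, 1.0, 2.0]
ys = [-1, -1, 1, 1]

margin_widths = {}
for c in (0.05, 0.4, 10.0):
    w, b = fit_by_grid_search(c, xs, ys)
    margin_widths[c] = 2.0 / abs(w)  # distance between the two margins
    print(f"c = {c:<5} w = {w:.2f}  margin width = {margin_widths[c]:.2f}")
```

On this toy data the margin width should shrink as $c$ grows, from roughly $6.67$ at $c = 0.05$ to $2.5$ at $c = 0.4$ and $2$ at $c = 10$, where the solution coincides with the hard-margin one.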

Because $\alpha$ lives in the dual problem, I am not aware of an interpretation in the primal problem other than the KKT conditions:

$$(\alpha_k = 0) \implies y_{k}(w^{T}x_{k}+b) \geq 1$$ $$(0 < \alpha_k < c) \implies y_{k}(w^{T}x_{k}+b) = 1$$ $$(\alpha_k = c) \implies y_{k}(w^{T}x_{k}+b) \leq 1$$
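To make these conditions concrete, here is a small sketch that maps each $\alpha_{k}$ to the region of the picture its point falls in. It also uses the stationarity condition $w = \sum_{k} \alpha_{k} y_{k} x_{k}$, a standard part of the KKT system that is not written out above. The data, the hand-derived $\alpha$ values and the helper names (`weight_from_alphas`, `kkt_region`, `describe`) are all assumptions made for illustration.

```python
import math


def weight_from_alphas(alphas, xs, ys):
    """Stationarity of the Lagrangian: w = sum_k alpha_k * y_k * x_k."""
    dim = len(xs[0])
    return [sum(a * y * x[d] for a, x, y in zip(alphas, xs, ys)) for d in range(dim)]


def kkt_region(alpha, c, tol=1e-9):
    """Where a point sits relative to its own class margin, judged from alpha alone."""
    if alpha <= tol:
        return "on or outside its margin"
    if alpha >= c - tol:
        return "on or inside its margin"
    return "exactly on its margin"


def describe(alphas, xs, ys, b, c):
    """Return w and, per point, (region, y_k * (w^T x_k + b), xi_k / ||w||)."""
    w = weight_from_alphas(alphas, xs, ys)
    norm_w = math.sqrt(sum(wi * wi for wi in w))
    report = []
    for a, x, y in zip(alphas, xs, ys):
        functional_margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
        slack = max(0.0, 1.0 - functional_margin)
        report.append((kkt_region(a, c), functional_margin, slack / norm_w))
    return w, report


# Hypothetical 1-D toy data; the alphas were derived by hand for c = 0.4
# and satisfy sum_k alpha_k * y_k = 0 and 0 <= alpha_k <= c.
xs = [[-2.0], [-1.0], [1.0], [2.0]]
ys = [-1, -1, 1, 1]
alphas = [0.0, 0.4, 0.4, 0.0]

w, report = describe(alphas, xs, ys, b=0.0, c=0.4)
print("w =", w)
for x, row in zip(xs, report):
    print(x, row)
```

In this example the two inner points have $\alpha_{k} = c$ and sit inside the margin band, while the two outer points have $\alpha_{k} = 0$ and lie beyond their margins, so only the inner points contribute to $w$.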