How do we know these hyperplanes really separate the data?


In the math behind Support Vector Machines:

"Given a hyperplane H0 separating the dataset and satisfying:"

w⋅x + b = 0

"We can select two other hyperplanes H1 and H2 which also separate the data and have the following equations:"

w⋅x + b = δ

and

w⋅x + b = −δ

How can we be sure that these two hyperplanes also separate the dataset?

Also: "Here the variable δ is not necessary. So we can set δ=1 to simplify the problem."

w⋅x + b = 1

and

w⋅x + b = −1

I'm also not sure why the variable δ is not necessary here, or why the value 1 was chosen for it.

Thanks!


There is 1 solution below


Saying that the hyperplane $$ \langle w,x\rangle+b=0 $$ separates the points means that $$ \langle w,x_i\rangle+b>0 $$ for sample points $x_i$ with positive label $+1$ and $\langle w,x_i\rangle+b<0$ for negatively labeled points. These inequalities are strict, so for each positively labeled $x_i$ you can replace $0$ by some $\delta_i>0$, that is, $\langle w,x_i\rangle+b>\delta_i$. Since there are only finitely many sample points, $\delta_1=\min\{\delta_i\}$ is still positive, and $\langle w,x_i\rangle+b>\delta_1$ for all sample points $x_i$ with positive label. In the same way, you can achieve $\langle w,x_i\rangle+b<-\delta_2$ for all sample points $x_i$ with negative label, for some $\delta_2>0$. Choosing $\delta=\min(\delta_1,\delta_2)$ gives $\langle w,x_i\rangle+b>\delta$ on the positive side and $\langle w,x_i\rangle+b<-\delta$ on the negative side, so no sample point lies between $H_1$ and $H_2$ and both hyperplanes separate the data.
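If it helps to see the construction on numbers, here is a small sanity check in plain Python. The four points, their labels, and the hand-picked $w$ and $b$ are made up for illustration, and taking each $\delta_i$ to be half of $|\langle w,x_i\rangle+b|$ is just one arbitrary way to get a value strictly below it.

```python
# Toy 2-D sample (made up for illustration) with labels +1 / -1.
points = [(2.0, 2.0), (3.0, 2.0), (0.0, 0.0), (1.0, 0.5)]
labels = [1, 1, -1, -1]

# Hand-picked separating hyperplane H0: <w, x> + b = 0 (an assumption).
w = (1.0, 1.0)
b = -2.5


def value(w, b, x):
    """Return <w, x> + b for a point x."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


values = [value(w, b, x) for x in points]

# H0 separates the data: positive points give > 0, negative points give < 0.
assert all(v * y > 0 for v, y in zip(values, labels))

# Pick delta_i strictly below each |value|; half of it is one arbitrary choice.
delta_1 = min(v / 2 for v, y in zip(values, labels) if y == 1)
delta_2 = min(-v / 2 for v, y in zip(values, labels) if y == -1)
delta = min(delta_1, delta_2)

# With H1 (= delta) and H2 (= -delta), every point stays strictly on its side.
separated = all(
    (v > delta) if y == 1 else (v < -delta)
    for v, y in zip(values, labels)
)
print(delta, separated)
```

Any other choice of $\delta_i$ strictly between $0$ and $|\langle w,x_i\rangle+b|$ works the same way; only positivity of the final $\delta$ matters.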

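Dividing the equations $\langle w,x\rangle+b=\pm\delta$ by $\delta$ lets you replace $\delta$ by $1$, provided you also replace $w$ by $w/\delta$ and $b$ by $b/\delta$. This does not change the separating hyperplane, because $\langle w,x\rangle+b=0$ and $\langle w/\delta,x\rangle+b/\delta=0$ have exactly the same solutions.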
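The rescaling step can be checked numerically too. This is only a sketch on a made-up toy sample: the points, $w$, $b$, and the value $\delta=0.5$ are assumptions chosen by hand so that $\langle w,x_i\rangle+b>\delta$ on the positive side and $<-\delta$ on the negative side.

```python
# Toy 2-D sample (made up for illustration) with labels +1 / -1.
points = [(2.0, 2.0), (3.0, 2.0), (0.0, 0.0), (1.0, 0.5)]
labels = [1, 1, -1, -1]

# Hand-picked hyperplane and margin value (assumptions for this example).
w = (1.0, 1.0)
b = -2.5
delta = 0.5


def value(w, b, x):
    """Return <w, x> + b for a point x."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


# Rescale: w -> w / delta, b -> b / delta.
w_scaled = tuple(wi / delta for wi in w)
b_scaled = b / delta

# After rescaling, the two margin hyperplanes are <w', x> + b' = +1 and -1.
scaled_values = [value(w_scaled, b_scaled, x) for x in points]
margin_ok = all(
    (v > 1) if y == 1 else (v < -1)
    for v, y in zip(scaled_values, labels)
)

# A point on the original H0 is still on the rescaled H0.
on_h0 = (1.25, 1.25)
same_hyperplane = value(w, b, on_h0) == 0 and value(w_scaled, b_scaled, on_h0) == 0

print(scaled_values, margin_ok, same_hyperplane)
```

So $\delta$ carries no extra information: it is absorbed into the scale of $(w,b)$, and $1$ is simply the most convenient normalization.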

In one sentence: the quantities $\langle w,x_i\rangle+b$ are functional distances to the hyperplane (signed distances scaled by $\|w\|$), and you can rescale them by any positive factor without moving the hyperplane.