In the derivation of the support vector classifier


I was studying Stephen Boyd's textbook on convex optimization and have a question about the support vector classifier. The book says the following:


In linear discrimination, we seek an affine function $f(x) = a^T x - b$ that classifies the points, i.e.,

$a^T x_i - b > 0, i = 1,..., N$

$a^T y_i - b < 0, i = 1, ..., M$

Geometrically, we seek a hyperplane that separates the two sets of points. Since the strict inequalities are homogeneous in $a$ and $b$, they are feasible if and only if the set of nonstrict linear inequalities

$a^T x_i - b \geq 1, i = 1,..., N$

$a^T y_i - b \leq -1, i = 1, ..., M$

(in the variables $a$, $b$) is feasible.


First, what does it mean that the original inequalities are homogeneous in $a$ and $b$? Second, I have trouble understanding the "if and only if" relationship. The "if" part is obvious: if $a^T x_i - b \geq 1$, then certainly $a^T x_i - b > 0$ (and similarly for the $y_i$ inequalities). However, can anybody explain to me why the "only if" part also holds?

Thanks,


The inequalities are homogeneous in $a$ and $b$ because you can multiply both $a$ and $b$ by any $\alpha > 0$ and the inequalities will still hold. In other words, you can always "scale" $a$ and $b$ up or down as you like.
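Spelled out: if $a^T x_i - b > 0$ and $a^T y_j - b < 0$, then for any $\alpha > 0$,

$$(\alpha a)^T x_i - (\alpha b) = \alpha\left(a^T x_i - b\right) > 0, \qquad (\alpha a)^T y_j - (\alpha b) = \alpha\left(a^T y_j - b\right) < 0,$$

so the pair $(\alpha a, \alpha b)$ separates the two point sets exactly when $(a, b)$ does.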

The author is being a little sloppy here, and this sloppiness is confusing you. Here's a more formal way of stating what the author wishes to say:

There are $a$ and $b$ such that $a^T x_i - b > 0$ and $a^T y_j - b < 0$ for all $i\in\{1,\ldots,N\}$ and $j\in\{1,\ldots,M\}$ if and only if there are $a$ and $b$ such that $a^T x_i - b \ge 1$ and $a^T y_j - b \le -1$ for all $i\in\{1,\ldots,N\}$ and $j\in\{1,\ldots,M\}$.

In other words, the $a, b$ in the first set of inequalities need not be the same as the $a, b$ in the second set.

Given that you can scale $a$ and $b$ by an arbitrarily large positive constant, the statement follows.
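Concretely, here is a sketch of the "only if" direction (the quantity $\epsilon$ below is introduced just for this argument). Suppose $(a, b)$ satisfies the strict inequalities, and set

$$\epsilon = \min\Big\{ \min_{i=1,\ldots,N} \left(a^T x_i - b\right),\ \min_{j=1,\ldots,M} \left(b - a^T y_j\right) \Big\}.$$

Each inner minimum is over finitely many strictly positive numbers, so $\epsilon > 0$. Dividing the strict inequalities by $\epsilon$ gives

$$\frac{a^T x_i - b}{\epsilon} \geq 1, \qquad \frac{a^T y_j - b}{\epsilon} \leq -1,$$

for all $i$ and $j$, so the scaled pair $(a/\epsilon, b/\epsilon)$ satisfies the nonstrict inequalities. Note that this argument uses the finiteness of the two point sets: with infinitely many points the infimum could be $0$, and the scaling would fail.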