I was studying Stephen Boyd's textbook on convex optimization and have a question about the support vector classifier. The book says the following:
In linear discrimination, we seek an affine function $f(x) = a^T x - b$ that classifies the points, i.e.,
$a^T x_i - b > 0, \quad i = 1, \ldots, N$
$a^T y_i - b < 0, \quad i = 1, \ldots, M$
Geometrically, we seek a hyperplane that separates the two sets of points. Since the strict inequalities are homogeneous in $a$ and $b$, they are feasible if and only if the set of nonstrict linear inequalities
$a^T x_i - b \geq 1, \quad i = 1, \ldots, N$
$a^T y_i - b \leq -1, \quad i = 1, \ldots, M$
(in the variables $a$, $b$) is feasible.
First, what does it mean that the original inequalities are homogeneous in $a$ and $b$? Second, I have trouble understanding the if-and-only-if relationship. The "if" part is obvious: if $a^T x_i - b \geq 1$, then certainly $a^T x_i - b > 0$. However, can anybody explain to me why the "only if" part also holds?
Thanks,
The inequalities are homogeneous in $a$ and $b$ because you can multiply both $a$ and $b$ by any $\alpha > 0$ and the inequalities will still hold. In other words, you can always "scale" $a$ and $b$ up or down as you like.
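As a concrete illustration, here is a minimal numeric sketch of the homogeneity claim. The 2-D points and the separator $(a, b)$ below are made up for this example (none of these numbers come from the book): scaling $(a, b)$ by any $\alpha > 0$ leaves the strict inequalities intact.

```python
import numpy as np

# Hypothetical toy data (not from the book): x-points vs. y-points in 2-D.
X = np.array([[2.0, 1.0], [3.0, 2.0], [2.0, 3.0]])       # points x_i
Y = np.array([[-2.0, -1.0], [-1.0, -2.0], [-4.0, 1.0]])  # points y_j

a = np.array([1.0, 1.0])  # a separating hyperplane a^T z = b ...
b = 0.0                   # ... chosen by eye for this data

def strictly_separates(a, b):
    # a^T x_i - b > 0 for all i, and a^T y_j - b < 0 for all j
    return bool(np.all(X @ a - b > 0) and np.all(Y @ a - b < 0))

assert strictly_separates(a, b)

# Homogeneity: multiplying (a, b) by any alpha > 0 preserves the inequalities.
for alpha in (0.01, 1.0, 100.0):
    assert strictly_separates(alpha * a, alpha * b)
```

The point of the loop is that the sign of $\alpha(a^T z - b)$ equals the sign of $a^T z - b$ whenever $\alpha > 0$.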
The author is being a little sloppy here and this sloppiness is confusing you. Here's a more formal way of stating what the author wishes to say:
There are $a$ and $b$ such that $a^T x_i - b > 0$ and $a^T y_j - b < 0$ for all $i\in\{1,\ldots,N\}$ and $j\in\{1,\ldots,M\}$ if and only if there are $a$ and $b$ such that $a^T x_i - b \ge 1$ and $a^T y_j - b \le -1$ for all $i\in\{1,\ldots,N\}$ and $j\in\{1,\ldots,M\}$.
In other words, the $a,b$ in the first inequalities are different from the $a,b$ in the second inequalities.
Given that you can scale $a$ and $b$ by an arbitrary positive constant, the statement follows: if $(a, b)$ satisfies the strict inequalities, let $t = \min\{\min_i (a^T x_i - b),\ \min_j (b - a^T y_j)\} > 0$; then $(a/t, b/t)$ satisfies the nonstrict inequalities. Please let me know if this is unclear and I'll supply more detail.
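To make the "only if" direction concrete, here is a small numeric sketch (toy numbers of my own, not from the book): take a strict separator whose smallest margin $t$ is less than $1$, and rescale $(a, b)$ by $1/t$ to obtain the nonstrict inequalities with right-hand sides $1$ and $-1$. The numbers are chosen so the arithmetic is exact in floating point.

```python
import numpy as np

# Hypothetical toy data (not from the book) that admits a strict separator.
X = np.array([[2.0, 2.0], [3.0, 1.0]])      # points x_i
Y = np.array([[-2.0, -2.0], [-3.0, -3.0]])  # points y_j

a = np.array([0.125, 0.125])  # strict separator with small margins
b = 0.0

# Smallest "margin" over both point sets; strict separation means t > 0.
t = min((X @ a - b).min(), (b - Y @ a).min())
assert 0 < t < 1   # strict inequalities hold, but margins are below 1

# "Only if": rescale by 1/t so every margin is at least 1.
a2, b2 = a / t, b / t
assert np.all(X @ a2 - b2 >= 1)    # a^T x_i - b >= 1
assert np.all(Y @ a2 - b2 <= -1)   # a^T y_j - b <= -1
```

The rescaled pair $(a/t, b/t)$ is exactly the construction in the "only if" argument: dividing every margin by the smallest one pushes it up to at least $1$.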