I'm going through the derivation of the hard margin SVM and I'm a little confused as to why there's a $1$ in the constraint as opposed to a $0$.
Consider the canonical form of the hard margin SVM constraint
$y_i (w^Tx_i - b) \geq 1, \forall i$
where $x_i$ is the training data, the label is $y_i \in \{ +1, -1\}$ and $w$, $b$ are parameters of a hyperplane.
I'm confused about where the $\geq 1$ comes from, because we could've achieved a similar result with
$y_i (w^Tx_i - b) \geq 0, \forall i$
similar to a classic perceptron.
After some reading, I've found a few reasons, but I don't quite understand the intuition behind them.
From this SE article, the right-hand side of the constraint was originally a new variable $\gamma$ that determines the size of the margin, but we can divide both sides by $\gamma$ and still solve for the same hyperplane. This explanation makes algebraic sense, but it doesn't make geometric sense to me.
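To make sure I'm following the algebra correctly (assuming $\gamma > 0$), here is the rescaling step as I understand it:

```latex
% Start from a constraint with an explicit margin variable \gamma > 0:
y_i \left( w^T x_i - b \right) \ge \gamma
\quad\Longleftrightarrow\quad
y_i \left( \left(\tfrac{w}{\gamma}\right)^T x_i - \tfrac{b}{\gamma} \right) \ge 1 .
% Substituting w' = w/\gamma and b' = b/\gamma gives the canonical
% constraint y_i (w'^T x_i - b') \ge 1, and (w', b') describes the
% same hyperplane as (w, b).
```

So algebraically the $1$ is just $\gamma$ absorbed into the parameters, which is why the same hyperplane comes out.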
With a $\geq 0$ inequality instead of $\geq 1$, the optimizer can return any scaled version $(\alpha w, \alpha b)$, $\alpha > 0$, of the parameters and still describe the same hyperplane, which can lead to instability. Therefore, we want to put a constraint on $\|w\|_2$, and somehow setting the right-hand side of the inequality to $1$ accomplishes this.
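To make concrete what I mean by the scale ambiguity, here is a quick numeric check on made-up 2D data (the points, labels, and $(w, b)$ below are all hypothetical, just chosen so the data is separable):

```python
import numpy as np

# Hypothetical linearly separable 2D data with labels in {+1, -1}.
X = np.array([[2.0, 1.0], [1.0, 3.0], [-1.0, -2.0], [-2.0, 0.5]])
y = np.array([1, 1, -1, -1])

# An arbitrary separating hyperplane (w, b).
w = np.array([1.0, 0.5])
b = 0.0

# Functional margins y_i (w^T x_i - b); their signs give the classification.
margins = y * (X @ w - b)

# Scaling (w, b) by any alpha > 0 leaves the hyperplane, and hence every
# sign, unchanged -- so infinitely many (alpha*w, alpha*b) satisfy a
# ">= 0" constraint, and the ">= 0" problem has no unique minimizer.
for alpha in [0.1, 1.0, 10.0]:
    scaled = y * (X @ (alpha * w) - alpha * b)
    assert np.array_equal(np.sign(scaled), np.sign(margins))
```

This is what I mean by the optimizer being free to pick any $\alpha$: every rescaling passes the $\geq 0$ constraint equally well.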
I like the reasoning behind the second explanation a lot more, because it identifies a real problem (infinitely many equivalent solutions) and resolves it by constraining the problem to a unique solution (should one exist), but I don't understand the math behind it.
Thanks for any clarifications!