In SVMs, why can't the Lagrange multipliers for support vectors be 0?


In hard-margin SVMs, we have the Lagrangian (min-max) formulation of the primal optimization problem:

\begin{align*} \min_{\vec{w}, b} \max_{\vec{\alpha}} \quad & \frac{||\vec{w}||^2}{2} + \sum_{i=1}^m \alpha_i \left( 1 - y_i (\vec{w} \cdot \vec{x}_i + b) \right) \\ \text{s.t.} \quad & \alpha_i \ge 0, \quad i = 1, ..., m \end{align*}

At the optimum, the KKT complementary slackness condition must hold:

$$\alpha_i \left(y_i (\vec{w} \cdot \vec{x}_i + b) - 1 \right) = 0$$

From this, I understand that if $\vec{x}_i$ is not a support vector, then we must have $\alpha_i = 0$. But must it be the case that $\alpha_i > 0$ for each support vector? By the conditions above, don't we really just have $\alpha_i \ge 0$ for each support vector?

I read an intuitive explanation (don't know if it's accurate) saying that we need $\alpha_i > 0$ for support vectors because this represents the importance of the support vectors in determining the margin. I agree it doesn't make intuitive sense if all $\alpha_i = 0$. But mathematically, how are we assured $\alpha_i > 0$ for all support vectors?
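To make my question concrete, here is a toy sanity check I worked through (my own example, not from any reference): with exactly one point per class, the dual problem can be solved in closed form, and there the multipliers come out strictly positive. The dual constraint $\alpha_1 y_1 + \alpha_2 y_2 = 0$ forces $\alpha_1 = \alpha_2 = \alpha$, the dual objective reduces to $2\alpha - \frac{\alpha^2}{2}\|\vec{x}_1 - \vec{x}_2\|^2$, and the maximizer is $\alpha = 2/\|\vec{x}_1 - \vec{x}_2\|^2 > 0$ for distinct points:

```python
import numpy as np

# Toy hard-margin SVM with one point per class (hypothetical data).
# Dual: max_a  a1 + a2 - (1/2) * sum_ij a_i a_j y_i y_j (x_i . x_j)
#       s.t.   a1*y1 + a2*y2 = 0,  a_i >= 0
# With y1 = +1, y2 = -1 the equality constraint forces a1 = a2 = a,
# the objective becomes 2a - (a**2 / 2) * ||x1 - x2||**2, and the
# maximizer is a = 2 / ||x1 - x2||**2, strictly positive whenever
# the two points are distinct.

x1, y1 = np.array([0.0, 0.0]), +1
x2, y2 = np.array([2.0, 0.0]), -1

a = 2.0 / np.dot(x1 - x2, x1 - x2)   # alpha_1 = alpha_2 = 0.5 here
w = a * y1 * x1 + a * y2 * x2        # w = sum_i alpha_i y_i x_i
b = y1 - np.dot(w, x1)               # from y1 * (w . x1 + b) = 1

# Both points lie exactly on the margin (functional margin = 1),
# and both multipliers are strictly positive:
print(a)                             # 0.5
print(y1 * (np.dot(w, x1) + b))      # 1.0
print(y2 * (np.dot(w, x2) + b))      # 1.0
```

So in this minimal case the complementary slackness condition is satisfied with both constraints active and both $\alpha_i > 0$. My question is whether something like this strict positivity is guaranteed in general, or whether a point on the margin could still have $\alpha_i = 0$.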

Despite lots of study, I'm confused. Thank you for your patience and any detailed explanations you can provide!