I'm reading this tutorial about SVMs.
I'd like to have two clarifications:
- at page 4 (bottom), why is it that, after using (1.10), the summation extends only over $m \in S$? In (1.10) the summation runs over all elements of $L$, and I don't think ${\mathbf x}_m\cdot {\mathbf x}_s = 0$ is necessarily true for the other elements.
- page 5: why is taking the average better? Isn't the value of $b$ supposed to be unique?
$1$. The reason is that for any non-support vector ${\mathbf x}_i$, we have $\alpha_i = 0$, so those terms drop out of the sum. This is how Lagrange multipliers work for inequality constraints such as the ones here: only the vectors for which the inequality becomes an equality, which are support vectors by definition, have a non-zero Lagrange multiplier. A constraint with a zero Lagrange multiplier is not "active" at the optimal solution, i.e. it makes no difference to it, which is what you'd expect for vectors not lying on the hyperplanes $H_1$ or $H_2$.
For worked examples of this, see: Kuhn-Tucker examples
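To see the effect numerically, here is a small sketch (the data points are made up for illustration) that solves the hard-margin dual problem directly with `scipy.optimize.minimize` and shows that the multipliers of the non-support vectors come out as zero:

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical toy data: two points per class, linearly separable.
# The max-margin hyperplane is x1 = 2, so only (1,0) and (3,0) sit
# on the margin hyperplanes H1/H2 -- they are the support vectors.
X = np.array([[1.0, 0.0], [0.0, 1.0],   # class -1
              [3.0, 0.0], [4.0, 1.0]])  # class +1
y = np.array([-1.0, -1.0, 1.0, 1.0])

Z = y[:, None] * X
K = Z @ Z.T                              # K_ij = y_i y_j x_i . x_j

# Hard-margin dual: maximise sum(alpha) - 0.5 alpha' K alpha
# subject to alpha_i >= 0 and sum_i alpha_i y_i = 0.
res = minimize(lambda a: 0.5 * a @ K @ a - a.sum(),
               x0=np.zeros(len(y)),
               jac=lambda a: K @ a - np.ones(len(y)),
               bounds=[(0.0, None)] * len(y),
               constraints=[{'type': 'eq', 'fun': lambda a: a @ y}],
               method='SLSQP')
alpha = res.x
print(np.round(alpha, 4))   # non-support vectors get alpha ~ 0
```

The two interior points end up with $\alpha_i \approx 0$, so dropping them from the sum, as the tutorial does after (1.10), changes nothing.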
$2$. I think in nearly all cases the value of $b$ will be the same for all the support vectors; the averaging is just a safeguard against anomalies in the numerical calculation of $b$. I agree with you that in theory, all the $b$'s should have the same value. See Page 565 of these notes where there is an example of such calculations: there is a small variation in the numerical values of $b$ across the support vectors. It's only tiny in that example, but one might be able to construct cases where the discrepancies become a problem. Taking the average minimises the risk of inadvertently striking such cases.
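As a concrete sketch of the averaging step (with made-up support vectors and multipliers standing in for the output of the dual optimisation): $b$ is computed once per support vector via $b_s = y_s - {\mathbf w}\cdot{\mathbf x}_s$, and the reported $b$ is the mean of these.

```python
import numpy as np

# Hypothetical toy solution: support vectors of a 2-D problem, with
# multipliers alpha as they would come out of the dual optimisation.
X_sv  = np.array([[1.0, 0.0], [3.0, 0.0]])
y_sv  = np.array([-1.0, 1.0])
alpha = np.array([0.5, 0.5])

w = (alpha * y_sv) @ X_sv      # w = sum_s alpha_s y_s x_s
b_each = y_sv - X_sv @ w       # b_s = y_s - w . x_s, one value per SV
b = b_each.mean()              # the averaging safeguard
print(b_each, b)
```

In exact arithmetic every entry of `b_each` is identical, so the mean changes nothing; with floating-point round-off the entries can differ slightly, and the mean is a cheap way to smooth that out.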