My question is: how do we derive $ b^* $ (the optimal $ b $) when every $ \alpha_{i} $ is either $ 0 $ or $ C $?
For a given SVM primal problem:
$$ \text{minimize } \frac{1}{2}w^{T}w + C\sum_{i=1}^{l}\xi_{i}$$ $$ \text{subject to } y_{i}(w^{T}\phi(x_{i}) + b) \ge 1 - \xi_{i} $$ $$ \xi_{i} \ge 0 $$
I already know how to derive the dual problem:
$$ \text{minimize } \frac{1}{2}\alpha^{T}Q\alpha - \mathbf{1}^{T}\alpha$$ $$ \text{subject to } 0 \le \alpha_{i} \le C$$ $$ \mathbf{y}^{T}\alpha = 0 $$ $$ \text{where } Q_{ij} = y_{i} y_{j} \phi(x_{i})^{T} \phi(x_{j}) $$

And I know that, by the KKT conditions, we have:

$$ \alpha_{i}(1 - \xi_{i} - y_{i}(w^{T}\phi(x_i) + b)) = 0 $$ $$ \beta_{i}(-\xi_i) = 0 $$ $$ \nabla_{w} L(w, b, \xi, \alpha, \beta) = 0 \Rightarrow w = \sum_{i=1}^{l} \alpha_{i} y_{i} \phi(x_{i}) $$ $$ \nabla_{b} L(w, b, \xi, \alpha, \beta) = 0 \Rightarrow \sum_{i=1}^{l}\alpha_{i} y_{i} = 0 $$ $$ \nabla_{\xi} L(w, b, \xi, \alpha, \beta) = 0 \Rightarrow C = \alpha_{i} + \beta_{i}$$
I just saw pages 13 and 14 of this paper, but it only says to pick an index $ i $ with $ 0 \lt \alpha_{i} \lt C $, so that we can find $ b $ from $$ \alpha_{i}(1 - \xi_{i} - y_{i}(w^{T}\phi(x_i) + b)) = 0 $$ $$ \beta_{i}(-\xi_i) = 0 $$ because then $ \beta_i = C - \alpha_i > 0 $ forces $ \xi_i = 0 $, and $ \alpha_i > 0 $ forces $ 1 - \xi_{i} - y_{i}(w^{T}\phi(x_i) + b) = 0 $.
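Explicitly, since $ \xi_i = 0 $ and $ y_i \in \{-1, +1\} $ (so $ 1/y_i = y_i $), substituting $ w = \sum_{m=1}^{l} \alpha_{m} y_{m} \phi(x_{m}) $ into $ y_{i}(w^{T}\phi(x_i) + b) = 1 $ gives

$$ b = y_i - \sum_{m=1}^{l} \alpha_{m} y_{m} \phi(x_{m})^{T}\phi(x_i). $$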
But what if every $ \alpha_{i} $ is either $ 0 $ or $ C $?
Note: I am using the notation from this slide.
In order to determine the bias $b$, you use
$$Y(\boldsymbol{x}) = \boldsymbol{w}^T\boldsymbol{\phi}(\boldsymbol{x})+b$$
for the support vectors for which $0< \alpha_i<C$ and $\xi_i=0$, because they lie on the boundary of the margin.
Then you will have to solve
$$y_i\left[\sum_{m\in S}\alpha_my_m\boldsymbol{\phi}^T(\boldsymbol{x}_i)\boldsymbol{\phi}(\boldsymbol{x}_m)+b\right]=1,$$
in which $S$ is the set of indices of all the support vectors, and you solve this equation for the bias $b$.
In Pattern Recognition and Machine Learning by Bishop, the author also argues that it is numerically more stable to average this equation over all of the margin support vectors rather than rely on a single one.
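Averaging over the set $M$ of margin support vectors (those with $0 < \alpha_n < C$, for which $\xi_n = 0$) yields

$$ b = \frac{1}{|M|}\sum_{n\in M}\left( y_{n} - \sum_{m\in S}\alpha_{m} y_{m}\, \boldsymbol{\phi}^{T}(\boldsymbol{x}_{n})\boldsymbol{\phi}(\boldsymbol{x}_{m}) \right). $$

As a minimal sketch of the bookkeeping in Python/NumPy (the data and the dual variables `alpha` here are hand-picked toy values, not the output of a real QP solver, and a linear kernel $\phi(x)=x$ is assumed):

```python
import numpy as np

# Toy 2D data with labels in {-1, +1}; alpha is a hand-picked
# illustrative dual solution (not produced by a real QP solver).
X = np.array([[1.0, 1.0], [2.0, 2.0], [-1.0, -1.0], [-2.0, -2.0]])
y = np.array([1.0, 1.0, -1.0, -1.0])
alpha = np.array([0.25, 0.0, 0.25, 0.0])
C = 1.0

K = X @ X.T  # linear kernel: k(x_i, x_j) = x_i^T x_j

sv = alpha > 1e-8                # support vectors: alpha_n > 0
free = sv & (alpha < C - 1e-8)   # margin SVs: 0 < alpha_n < C

# b = mean over margin SVs n of ( y_n - sum_{m in S} alpha_m y_m k(x_n, x_m) )
b = np.mean(y[free] - K[np.ix_(free, sv)] @ (alpha[sv] * y[sv]))
print(b)  # 0.0 for this toy configuration
```

Note that this recipe only works when `free` is non-empty; if every $\alpha_n$ sits at a bound $0$ or $C$ (the case asked about in the question), the mean above is over an empty set and some other rule for choosing $b$ is needed.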