In the kernelized form of the SVM, we are trying to solve the problem
\begin{align} \min_{w \in H, \, b \in \mathbb{R}} \quad &\frac{1}{2} ||w||^2 + C \sum_{i = 1}^N \xi_i \\ \text{s.t.} \quad &y_i f(x_i) \geq 1 - \xi_i \quad \forall i \\ &\xi_i \geq 0 \quad \forall i \end{align}
where $f(x) := \left<w, h(x)\right> + b$, and $h$ is a feature map from the original feature space $X$ into the enlarged space $H$. The space $H$ is equipped with the inner product $\left< \cdot, \cdot \right>$ and, by extension, the induced norm $||\cdot||$.
However, when deriving the dual form of this problem, every source I've run into treats $h(x)$ as finite-dimensional and $\left<w, h(x)\right>$ as a standard dot product, whereas the most popular kernel, the RBF kernel, implicitly maps the data points $x$ into an infinite-dimensional space, where the inner product is of $L^2$ type (an integral, not a finite sum). In particular, when solving for the dual, these sources simply "differentiate" the Lagrangian "with respect to" $w$, even though this is not formally justified when $w$ lives in an infinite-dimensional space.
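For concreteness, this is the step I mean. Writing the Lagrangian as these sources do (treating $w$ like an ordinary vector, with multipliers $\alpha_i, \mu_i \geq 0$),
\begin{align} L(w, b, \xi, \alpha, \mu) = \frac{1}{2} ||w||^2 + C \sum_{i=1}^N \xi_i - \sum_{i=1}^N \alpha_i \left[ y_i \left( \left<w, h(x_i)\right> + b \right) - 1 + \xi_i \right] - \sum_{i=1}^N \mu_i \xi_i, \end{align}
they set $\partial L / \partial w = 0$ to obtain
\begin{align} w = \sum_{i=1}^N \alpha_i y_i \, h(x_i), \end{align}
after which the dual involves only the inner products $\left<h(x_i), h(x_j)\right> = K(x_i, x_j)$, i.e. kernel evaluations. My question is why this coordinate-wise differentiation remains valid when $H$ is infinite-dimensional.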
What is really going on here? I have not taken a course on functional analysis or calculus of variations so the process of optimizing a function of infinitely many variables is foreign to me. Any clarifications or directions to appropriate source materials would be greatly appreciated!