Page 7 of https://web.stanford.edu/class/cs229t/scribe_notes/10_10_final.pdf
I've tried finding a proof online, but haven't been able to find one. In the notes linked above, which are part of Stanford's Statistical Learning Theory course, the hinge loss is defined as $$ \ell(z,h) = \max(0,\, 1 - y\,h(x)) $$ where $z = (x,y)$ and $h$ is some hypothesis.
Is it possible to provide a proof that this loss is $1$-Lipschitz? More generally, is there a standard method for establishing a Lipschitz bound for functions whose gradient is not defined at every point?
I hope someone can correct my (wrong) intuition: if a function $f$ is $K$-Lipschitz, then the magnitude of its gradient, wherever it exists, is bounded by $K$. But in this case, why is the gradient of the hinge loss necessarily bounded? Can't we choose two points $x_1$ and $x_2$ that are arbitrarily close but carry opposite labels, so that $\ell(z_1,h) = 0$ (when $y_1 h(x_1) \ge 1$) while $\ell(z_2,h) = 1 + h(x_2) \ge 2$?
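For what it's worth, here is a quick numerical sanity check I ran. It treats the hinge loss as a function of the scalar margin $t = y\,h(x)$ (which I am assuming is the variable the Lipschitz claim refers to, rather than $x$ itself) and tests the inequality $|\ell(a) - \ell(b)| \le |a - b|$ on random pairs of margins:

```python
import random

def hinge(t):
    """Hinge loss viewed as a function of the margin t = y * h(x)."""
    return max(0.0, 1.0 - t)

random.seed(0)

# Count violations of the 1-Lipschitz inequality |hinge(a) - hinge(b)| <= |a - b|
# over random pairs of margins (small tolerance for floating-point noise).
violations = 0
for _ in range(100_000):
    a = random.uniform(-5.0, 5.0)
    b = random.uniform(-5.0, 5.0)
    if abs(hinge(a) - hinge(b)) > abs(a - b) + 1e-12:
        violations += 1

print(violations)
```

In my runs this prints `0`, which is consistent with the claim holding in the margin variable, even though the loss is not differentiable at $t = 1$.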