Prove that any subgradient of the hinge loss function $max\lbrace 0, y\langle\mathbf{w}, \mathbf{x}\rangle\rbrace$ is of the form $\alpha\mathbf{x}$

55 Views Asked by At

I tried to prove the following claim,

Let $f(\mathbf{w})=max\{0,1−y\langle\mathbf{w}, \mathbf{x}\rangle\}$, where $y\in\lbrace -1, 1\rbrace$, be a hinge loss function, then any subgradient of $f$ at $\mathbf{w}$ is of the form $\alpha \mathbf{x}$ where $|\alpha| \le 1$.

But haven't got any success. Here's what I've accomplished so far.

Suppose $f$ is differentiable at $(\mathbf{w})$ then either $\nabla f=y\mathbf{x}$ or $\nabla f=0$ and it's the only subgradient of $f$ that satisfies the condition in the claim. Now suppose $f$ is not differentiable at $\mathbf{w}$ and let $\mathbf{v}\in\partial f(\mathbf{w})$, then $\forall \mathbf{u}$ we have that

$$ f(\mathbf{u})\ge f(\mathbf{w}) + \langle\mathbf{u}-\mathbf{w}, \mathbf{v}\rangle. $$

Let $\mathbf{u}=\mathbf{w}+\delta\mathbf{x}, \delta\in\mathbb{R}$, then

$$ \begin{align} f(\mathbf{u})-f(\mathbf{w})&\ge\langle\mathbf{u}-\mathbf{w}, \mathbf{v}\rangle\\ \Leftrightarrow f(\mathbf{w}+\delta\mathbf{x})-f(\mathbf{w})&\ge \langle\mathbf{w}+\delta\mathbf{x}-\mathbf{w}, \mathbf{v}\rangle\\ \Leftrightarrow f(\mathbf{w}+\delta\mathbf{x})-f(\mathbf{w})&\ge\langle\delta\mathbf{x}, \mathbf{v}\rangle\tag{1}\label{eq1}. \end{align} $$

Because maximum of continuous functions is continuous therefore let $\delta\to 0$ we have LHS \eqref{eq1}$\to 0$ and so $0\ge\delta\langle\mathbf{x}, \mathbf{v}\rangle$. I don't know how to go further from this.

1

There are 1 best solutions below

0
On

Here's my lengthy proof of that claim (the book where I found this claim says that "it's easy to show", so I'd appreciate if anyone can share a shorter proof).

Let $\mathbf{v}\in\partial f(\mathbf{w})$, we can represent $\mathbf{v}=c\mathbf{x}+\mathbf{v_0}$ where $c=\frac{\langle\mathbf{v}, \mathbf{x}\rangle}{||\mathbf{x}||^2}$ and $\langle\mathbf{v_0}, \mathbf{x}\rangle=0$.

Let $\mathbf{u}=\mathbf{w}+t\mathbf{v_0}$, where $t\in\mathbb{R}_+$ then

$$ \begin{align} f(\mathbf{u})&\ge f(\mathbf{w})+\langle\mathbf{u}-\mathbf{w}, \mathbf{v}\rangle\\ \Leftrightarrow f(\mathbf{w}+t\mathbf{v_0})&\ge f(\mathbf{w})+\langle t\mathbf{v_0}, \mathbf{v}\rangle\\ \Leftrightarrow f(\mathbf{w}+t\mathbf{v_0})&\ge f(\mathbf{w})+t\langle\mathbf{v_0}, c\mathbf{x}+\mathbf{v_0}\rangle\\ \Leftrightarrow f(\mathbf{w}+t\mathbf{v_0})&\ge f(\mathbf{w})+t\langle\mathbf{v_0}, \mathbf{v_0}\rangle\\ \Leftrightarrow f(\mathbf{w}+t\mathbf{v_0})&\ge f(\mathbf{w})+t||\mathbf{v_0}||^2\\ \Leftrightarrow f(\mathbf{w}+t\mathbf{v_0})&\ge t||\mathbf{v_0}||^2\hspace{1em}\text{since }f(\mathbf{w})\ge 0, \forall\mathbf{w}.\tag{1}\label{ineq1} \end{align} $$

Suppose $||\mathbf{v_0}||\ne 0$, then \eqref{ineq1} implies $f(\mathbf{w}+t\mathbf{v_0})\gt0$. Let $s=f(\mathbf{w}+t\mathbf{v_0})\gt 0$ and $t=\frac{s+\epsilon}{||\mathbf{v_0}||^2}$, where $\epsilon\gt 0$, then \eqref{ineq1} implies $s\ge s + \epsilon\Leftrightarrow 0\ge\epsilon$ which is a contradition. So we must have that $||\mathbf{v_0}||=0$ or $\mathbf{v_0}=\mathbf{0}$ and so $\mathbf{v}=c\mathbf{x}$.

If $c=0$ then $\mathbf{v}$ satisfies the condition in the claim. So suppose $c\ne 0$, let $\mathbf{u}=\mathbf{w}+t\mathbf{x}$, where $t\in\mathbb{R}$ and $tc\gt 0$, then we have

$$ \begin{align} f(\mathbf{u})&\ge f(\mathbf{w})+\langle\mathbf{u}-\mathbf{w}, \mathbf{v}\rangle\\ \Leftrightarrow f(\mathbf{w}+t\mathbf{x})&\ge \langle t\mathbf{x}, c\mathbf{x}\rangle\\ \Leftrightarrow f(\mathbf{w}+t\mathbf{x})&\ge |tc|\langle\mathbf{x}, \mathbf{x}\rangle.\tag{2}\label{ineq2} \end{align} $$

Since $tc\gt 0$, \eqref{ineq2} implies that $f(\mathbf{w}+t\mathbf{x})\gt 0$ and so $f(\mathbf{w}+t\mathbf{x})=1-y\langle\mathbf{w}+t\mathbf{x}\rangle$. Therefore,

$$ \begin{align} \eqref{ineq2}&\Rightarrow 1-y\langle\mathbf{w}+t\mathbf{x}\rangle\ge|tc|\langle\mathbf{x}, \mathbf{x}\rangle\\ &\Rightarrow 1+|\langle\mathbf{w}+t\mathbf{x}, \mathbf{x}\rangle|\ge|tc|\langle\mathbf{x}, \mathbf{x}\rangle\\ &\Rightarrow 1 + |\langle\mathbf{w}, \mathbf{x}\rangle| + |t| |\langle\mathbf{x}, \mathbf{x}\rangle|\ge|t||c|\langle\mathbf{x}, \mathbf{x}\rangle\\ &\Rightarrow 1 + |\langle\mathbf{w}, \mathbf{x}\rangle| \ge|t|(|c|-1)\langle\mathbf{x}, \mathbf{x}\rangle\\ &\Rightarrow \frac{1}{|t|} \frac{1 + |\langle\mathbf{w}, \mathbf{x}\rangle|}{\langle\mathbf{x}, \mathbf{x}\rangle} \ge|c|-1.\tag{3}\label{ineq3} \end{align} $$

Let $|t|\to\infty$, \eqref{ineq3} implies that $0\ge|c|-1$ or $|c|\le 1$ which concludes our proof.