How to show $\partial f(x) =\{\nabla f(x) \}$ for a convex differentiable function?

1.6k Views Asked by At

I want to show that if $f:\mathbb{E}\rightarrow\mathbb{R}$ is convex, and differentiable at $x$, then $\partial f(x) = \{ \nabla f(x) \} $ .

I understand that for a convex function, we have the following:

$$f(y) \le f(x)+\nabla f^T(y-x) \forall x,y \in dom \, f$$

and I know that the definition of the subdifferential of $f$ at $x$ is:

$$ \partial f(x) = \{ z | f(y) \ge f(x)+<z,y-x> \forall x,y \in dom\, f \} \}$$

So, I'm able to say that if $f$ is convex, then certainly $\nabla f(x) \in\partial f(x)$. But I do not understand how to show $\partial f(x) = \{ \nabla f(x) \} $?

(as a bit of a side note, I do not understand how we know that $f'(x)=\nabla f$? I know that it is differentiable at $x$, but that does not necessarily mean that the gradient exists right.....?

A function can be differentiable in all directions, and the gradient not exist, and we did not say that $f$ was continuous)

2

There are 2 best solutions below

5
On BEST ANSWER

Define the function $$\phi(t):=f(x+t(y-x))$$ for $t\in [0,1]$. Since $f$ is convex implies $$\phi(t)\leqslant (1-t)f(x)+tf(y)=(1-t)\phi(0)+t\phi(1)$$ hence $\phi(t)$ is a convex function on $[0,1]$. Moreover differentiability of $f$ implies that $\phi$ is also differentiable. In particular we have $$\phi'(0)=\langle\nabla f(x),y-x\rangle$$ Convexity of $\phi$ yields $$\phi'(0)\leqslant \phi(1)-\phi(0)=f(y)-f(x)\Rightarrow\langle \nabla f(x),y-x\rangle\leqslant f(y)-f(x)$$ Therefore by definition $\nabla f(x)\in\partial f(x)$. Now suppose that $z\in \partial f(x)$ then $$f(y)\geqslant f(x)+\langle z,y-x\rangle$$ for all $y$. Let $y:=x+tw$ and define the function $$\psi(t):=f(y)-f(x)-\langle z,y-x\rangle=f(x+tw)-f(x)-\langle z,tw\rangle$$ This gives $$\psi'(t)=\langle\nabla f(x), w\rangle-\langle z,w\rangle$$ Since $z\in\partial f(x)$ then $\psi(t)\geqslant 0$ for all $t$. Also $\psi(0)=0$ hence at $t=0$ we have a minimum for $\psi$. This implies $\psi'(0)=0$ which is equivalent to saying $$\langle\nabla f(x), w\rangle=\langle z,w\rangle$$ Since the last equation holds for any arbitrary $w$ then $\nabla f(x)=z$. This proves that in fact $$\partial f(x)=\{\nabla f(x)\}$$

1
On

$\text{First part of the proof is simple. I prove the second part. Suppose d is a vector of norm 1 and u} \\ \text{is a subgradient of f at x which is not} \ c \nabla f(x) \ \text{for some constant c (in that case} \\ \text{the proof is similar and simple)} \ \\ \text{By Taylor theorem we have:} \\ f(x + \epsilon d) = f(x) + \epsilon \langle\nabla f(x) , d\rangle + o(\epsilon) =\\ f(x) + \epsilon \langle\nabla f(x) , d\rangle + \epsilon \langle u , d\rangle - \epsilon \langle u , d\rangle + o(\epsilon)= \\ f(x) + \langle u,\epsilon d\rangle + \epsilon(\langle\nabla f(x) , d\rangle - \langle u,d\rangle + \frac{o(\epsilon)}{\epsilon}) \\ \text{We can find a vector d such that}\langle\nabla f(x) , d\rangle \,< 0 \, \text{and} \, \langle u,d\rangle \, > 0. \, \text{Now by choosing a very small positive} \, \epsilon \, \text{we have} \\ f(x + \epsilon d) < f(x) + \langle u , \epsilon d\rangle \, \text{which is a contradiction.} $