Gradient always points away from the minima of convex functions?

I'm reading a paper by Léon Bottou, "Online Learning and Stochastic Approximations". He studies online learning with cost functions $C(w)$ satisfying two conditions:

  1. $C(w)$ has a single minimum $w^*$.
  2. $C(w)$ satisfies $\inf_{\|w-w^*\|>\epsilon} (w-w^*)\cdot\nabla_w C(w) > 0$ for all $\epsilon>0$ (the infimum already ranges over all admissible $w$). In words, outside any $\epsilon$-ball around the minimum, the component of the gradient pointing away from the minimum is bounded away from zero.

Here $w$ is a vector in $\mathbb R^n$.
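For concreteness (my own example, not one from the paper), take the strictly convex quadratic $C(w)=\frac{1}{2}\|w-w^*\|^2$. Then $\nabla_w C(w)=w-w^*$, and

$$\inf_{\|w-w^*\|>\epsilon}(w-w^*)\cdot\nabla_w C(w)=\inf_{\|w-w^*\|>\epsilon}\|w-w^*\|^2=\epsilon^2>0,$$

so both conditions hold for this $C$.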

The author states that "this condition is weaker than the usual notion of convexity," but it is not completely clear to me why. My question is this:

Do all strictly convex functions $C(w)$ satisfy these conditions?

The first condition is the easy part: a strictly convex function has at most one minimum, and I am assuming one exists (excluding cases where the infimum is only approached at the boundary of the domain or at infinity). So my question is really about the second condition.

Best answer

Not every strictly convex function has a minimum (consider $w\mapsto e^w$ on $\mathbb R$), but assume one exists; then it is indeed unique. The expression $(w-w^*)\cdot\nabla_w C(w)$ is the scalar product between the vector from the minimum to the point $w$ and the gradient at that point. The gradient is orthogonal to the level set $\{x\colon C(x)=C(w)\}$ and points outward from the sublevel set $\{x\colon C(x)\le C(w)\}$, which is convex; the minimum $w^*$ lies in the interior of that sublevel set because $C(w^*)<C(w)$. Hence the gradient and the vector $w^*-w$ point strictly into opposite open half-spaces with respect to the hyperplane tangent to the level set at $w$. So yes, a differentiable strictly convex function satisfies $(w-w^*)\cdot\nabla_w C(w)>0$ for every $w\neq w^*$.

This pointwise positivity is slightly weaker than condition 2, which demands a positive infimum, but the gap closes when $\nabla_w C$ is continuous: on the compact sphere $\|w-w^*\|=\epsilon$ the continuous function $(w-w^*)\cdot\nabla_w C(w)$ attains a strictly positive minimum, and along each ray $w=w^*+tu$ with $\|u\|=1$ the quantity $t\,u\cdot\nabla_w C(w^*+tu)$ is nondecreasing in $t$ (it is the product of two nonnegative nondecreasing functions, since $t\mapsto u\cdot\nabla_w C(w^*+tu)$ is the derivative of the convex function $t\mapsto C(w^*+tu)$, which vanishes at $t=0$). So the minimum over the sphere is also a lower bound over all $\|w-w^*\|>\epsilon$.
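As a quick empirical cross-check (my own sketch, not from the paper or the answer; the test function `C`, its gradient `grad_C`, and all parameter values below are hypothetical choices), one can sample points outside an $\epsilon$-ball around $w^*$ for a strictly convex function and verify that the scalar product stays positive:

```python
import numpy as np

# Strictly convex test function of my own choosing:
# C(w) = log(cosh(w1)) + w2^2, with unique minimum w* = (0, 0).
w_star = np.zeros(2)

def grad_C(w):
    # Analytic gradient of C: (tanh(w1), 2*w2).
    return np.array([np.tanh(w[0]), 2.0 * w[1]])

rng = np.random.default_rng(0)
eps = 0.5

# Sample points outside the eps-ball around w* and record
# (w - w*) . grad C(w), which condition 2 says is bounded away from 0.
vals = [
    np.dot(w - w_star, grad_C(w))
    for w in rng.uniform(-5.0, 5.0, size=(10000, 2))
    if np.linalg.norm(w - w_star) > eps
]

print(min(vals))  # strictly positive, consistent with condition 2
```

On this bounded sampling box the observed minimum stays comfortably above zero, in line with the positive-infimum argument above.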