Local characterization for convex $C^1$ functions


Let $f:\mathbb R^n\to \mathbb R$ be $C^1$ and satisfy the following condition: for every $x\in\mathbb R^n$ there exists some $\varepsilon_x > 0$ such that for every $y$ with $\|y - x\| < \varepsilon_x$ we have $$ f(y) \ge f(x) + \nabla f(x)^T (y - x). $$ Is $f$ convex?

Note: This is a follow-up question to 'Locally' Convex Function, whose accepted answer assumes $C^2$.

Obviously a proof for $n=1$ suffices.
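For intuition, here is a quick numerical sketch (an illustration, not part of the question): for the convex $f(x)=x^2$ the tangent-line condition holds at every point, while the non-convex $f(x)=x^3$ already violates it locally at $x=-1$.

```python
# Numerical illustration (not a proof) of the local first-order condition
#   f(y) >= f(x) + f'(x)(y - x)   for y near x.
# It holds everywhere for the convex f(x) = x^2 (gap = h^2 >= 0),
# but fails for the non-convex f(x) = x^3 at, e.g., x = -1.

def tangent_gap(f, df, x, h):
    """f(x+h) minus the tangent line at x, evaluated at x+h."""
    return f(x + h) - (f(x) + df(x) * h)

# Convex example: the gap is h^2 >= 0 for every x and h.
assert all(tangent_gap(lambda x: x**2, lambda x: 2*x, x, h) >= 0
           for x in (-2.0, 0.0, 3.0) for h in (-0.01, 0.01))

# Non-convex example: for f(x) = x^3 the gap equals h^2 (3x + h),
# which is negative near x = -1, so the local condition fails there.
assert tangent_gap(lambda x: x**3, lambda x: 3*x**2, -1.0, 0.01) < 0
```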


BEST ANSWER

This is a much uglier proof than I had anticipated.

The key result is that if a real-valued $h$ satisfies the local supporting-line condition on an interval and attains a maximum in the interior of the interval, then $h$ is constant on the interval.

It is sufficient to prove that $\phi:I \to \mathbb{R}$ is convex for arbitrary $x, y \in \mathbb R^n$, where $I=[0,1]$ and $\phi$ is defined by $\phi(t) = f(ty+(1-t)x)$.

Let $\eta(t) = \phi(t) -l(t)$ where $l(t) = t \phi(1)+(1-t)\phi(0)$. Note that $\eta(0)=\eta(1) = 0$.

Let $t^* \in I$ maximise $\eta$ and suppose $\eta(t^*) > 0$. Note that $\eta(t) \le \eta(t^*)$ for all $t \in I$. The set $M= \{t \in I \mid \eta(t)=\eta(t^*) \}$ is clearly closed. Since $\eta(0)=\eta(1)=0<\eta(t^*)$, we have $M \subset (0,1)$, and $\eta'(t) =0$ for all $t \in M$ (each such $t$ is an interior maximiser).

We will show that $M$ is also open. Since $M$ is nonempty, closed, and open in the connected interval $I$, it must then equal $I$, which will give a contradiction.

Pick $t' \in M$. By the hypothesis applied at the point $t'y+(1-t')x$, there is some $\epsilon>0$ such that $\phi(t'+h)-\phi(t') \ge \phi'(t')h$ for $|h|< \epsilon$. Since $l(t'+h)-l(t') = l'(t') h$, we see that $\eta(t'+h)-\eta(t') \ge \eta'(t')h = 0$ for $|h| <\epsilon$.

Since $\eta(t')=\eta(t^*)$ is the maximum, in fact $\eta(t'+h) = \eta(t')$ for $|h| <\epsilon$, and so $(t'-\epsilon,t'+\epsilon) \subset M$. Hence $M$ is open, so $M = I$; in particular $\eta(0) = \eta(t^*)>0$, a contradiction.

Hence $\eta(t) \le 0$ for all $t \in I$, that is, $\phi(t) \le t\phi(1)+(1-t)\phi(0)$, and so $f$ is convex.

Note that only differentiability was used.
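The chord construction above can be sanity-checked numerically (a sketch, using the hypothetical convex example $f(x) = e^x$ with $x = 0$, $y = 1$, so $\phi(t) = e^t$): $\eta$ vanishes at the endpoints and is nonpositive on $[0,1]$.

```python
import math

# Numerical sanity check of the chord construction in the proof:
# for a convex f, eta(t) = phi(t) - l(t) <= 0 on [0, 1], with equality
# at the endpoints. Example (assumed): f(x) = exp(x), x = 0, y = 1.

def eta(t, f=math.exp, x=0.0, y=1.0):
    phi = lambda s: f(s * y + (1 - s) * x)            # restriction to the segment
    l = lambda s: s * phi(1.0) + (1 - s) * phi(0.0)   # chord through the endpoints
    return phi(t) - l(t)

assert eta(0.0) == 0.0 and eta(1.0) == 0.0        # eta vanishes at the endpoints
assert all(eta(k / 100) <= 0 for k in range(101))  # eta <= 0 on all of [0, 1]
```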

ANOTHER ANSWER

Yep, $f$ is convex. Indeed, even the assumption that $f \in C^1$ can be dispensed with, and replaced with continuity and a local subgradient condition. That is, if, for every $x \in \Bbb{R}^n$, there exists some $v_x \in \Bbb{R}^n$ and $\varepsilon_x > 0$ such that $$\|y - x\| < \varepsilon_x \implies f(y) \ge f(x) + v_x^\top (y - x), \tag{$\star$}$$ then $f$ is convex.

We show that each local subgradient can be replaced by a global subgradient. That is, we may remove the condition $\|y - x\| < \varepsilon_x$ in $(\star)$. Fix $x_0 \in \Bbb{R}^n$ and consider the function $$g(y) = f(y) - v_{x_0}^\top(y - x_0) - f(x_0).$$ Note that $g$ satisfies the condition $(\star)$ as well, $g(x_0) = 0$, and $g$ has a local minimum at $x_0$.
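For concreteness, here is a hypothetical one-dimensional instance of this shift (not from the answer): with $f(y) = y^2$, $x_0 = 2$, and $v_{x_0} = f'(2) = 4$, the shifted function is $g(y) = (y-2)^2$, which vanishes at $x_0$ and has a (local, in fact global) minimum there.

```python
# Worked example of the shift g(y) = f(y) - v^T (y - x0) - f(x0):
# with the assumed f(y) = y^2, x0 = 2, and subgradient v = f'(2) = 4,
# we get g(y) = y^2 - 4(y - 2) - 4 = (y - 2)^2.

x0, v = 2.0, 4.0
f = lambda y: y * y
g = lambda y: f(y) - v * (y - x0) - f(x0)

assert g(x0) == 0.0                                          # g(x0) = 0
assert all(g(x0 + h) >= 0 for h in (-0.5, -0.1, 0.1, 0.5))   # minimum at x0
```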

Fix $y_0 \in \Bbb{R}^n \setminus \{x_0\}$. We wish to show that the conclusion of $(\star)$ holds when $y = y_0$ and $x = x_0$, even if we don't have $\|y_0 - x_0\| < \varepsilon_{x_0}$. In particular, we are showing that $g(x_0)$ is a global minimum, not just a local one.

Restricting $g$ to the line segment $[x_0, y_0]$, we know that $g$ is continuous, and hence it attains a global maximum on this interval. This maximum must be at least $0$, since $g(x_0) = 0$. Further, since $x_0$ is a local minimum of $g$, this maximum must be achieved somewhere other than $x_0$.

Pick a point $x_1 \in \operatorname{argmax}_{x \in [x_0, y_0]} g(x) \setminus \{x_0\}$. If $y_0$ lies in this set, then we are done, so assume this is not the case, and hence $x_1 \neq y_0$. As per $(\star)$, there exist some $w_{x_1} \in \Bbb{R}^n$ and $\varepsilon_{x_1} > 0$ such that $$\|y - x_1\| < \varepsilon_{x_1} \implies g(y) \ge g(x_1) + w_{x_1}^\top(y - x_1).$$

Note that, for sufficiently small $|\lambda|$, $z_\lambda := x_1 + \lambda(y_0 - x_0) \in [x_0, y_0]$ and $\|z_\lambda - x_1\| < \varepsilon_{x_1}$, hence $$g(x_1) \ge g(z_\lambda) \ge g(x_1) + w_{x_1}^\top(z_\lambda - x_1),$$ which implies that $$0 \ge w_{x_1}^\top(z_\lambda - x_1) = \lambda w_{x_1}^\top(y_0 - x_0)$$ for all sufficiently small $|\lambda|$. Since $\lambda$ may take either sign, this forces $w_{x_1}^\top(y_0 - x_0) = 0$, and hence, for sufficiently small $|\lambda|$, $$g(x_1) \ge g(z_\lambda) \ge g(x_1) + 0,$$ so $z_\lambda$ also maximises $g$.

That is, $x_1$ lies in the interior of $\operatorname{argmax}_{x \in [x_0, y_0]} g(x)$, relative to the line segment $[x_0, y_0]$, and hence this $\operatorname{argmax}$ set is open in $[x_0, y_0]$. On the other hand, by the continuity of $g$, it is also closed, and hence must be the full line segment. But we explicitly assumed that $y_0$ did not maximise $g$ on the segment, so we have a contradiction. Thus the maximum of $g$ over $[x_0, y_0]$ occurs at $y_0$; in particular $g(y_0) \ge g(x_0) = 0$, and since $y_0$ was arbitrary, $g$ achieves a global minimum at $x_0$.

This means that $f$ has a global subgradient $v_x$ at every point $x$, which implies the convexity of $f$. In particular, we may write $$f(y) = \sup_{x \in \Bbb{R}^n} \bigl(f(x) + v_x^\top(y - x)\bigr)$$ (the supremum is attained at $x = y$), which exhibits $f$ as the pointwise supremum of affine functions, and such a supremum is convex.
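This envelope representation can be illustrated numerically (a sketch, with the assumed example $f(x) = x^2$, where $v_x = 2x$): taking the supremum of the tangent lines over a grid of base points recovers $f$ at any grid point.

```python
# Envelope check: f(y) = sup_x [ f(x) + v_x (y - x) ] for convex f.
# Assumed example: f(x) = x^2 with gradient v_x = 2x, so the affine
# minorant at x is a(y) = x^2 + 2x(y - x) = 2xy - x^2; the supremum
# over x is attained at x = y.

f = lambda x: x * x
grid = [i / 100 for i in range(-300, 301)]   # base points x in [-3, 3]

def envelope(y):
    """Max over the grid of the tangent lines of f, evaluated at y."""
    return max(f(x) + 2 * x * (y - x) for x in grid)

y0 = 1.5  # y0 lies on the grid, so the supremum is attained exactly
assert abs(envelope(y0) - f(y0)) < 1e-12
```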