Does this strong convexity estimate hold?

172 Views Asked by At

Let $F:[0,\infty) \to [0,\infty)$ be a $C^2$ strictly convex function, and let $r_0<r_1$ be positive fixed constants. Let $$a<r_0<r_1<c<b, \tag{1}$$ and let $\lambda \in [0,1]$ satisfy $ \lambda a +(1-\lambda)b=c. $

Set $D(a,b,c)=\lambda F(a)+(1-\lambda)F(b)-F(c) $.

Question: Does there exist a constant $m>0$ (which may depend on $f,r_0,r_1$ but not on $a,b,c$) such that $ D(a,b,c) \ge m\lambda(1-\lambda)(r_1-r_0)^2 $ for any choice of $a,b,c$ satisfying condition $(1)$?

Here is the key point:

If $f'' \ge m$, then $f$ is strongly convex with parameter $m$, so $$ D(a,b,c) \ge \frac{1}{2}m\lambda(1-\lambda)(b-a)^2 \ge \frac{1}{2}m\lambda(1-\lambda)(r_1-r_0)^2 \tag{2} $$ as required. However, in our case, $c$ and $b$ can be arbitrarily large, and $F$ can become "less convex" (closer to being affine) when $x \to \infty$. In other words, if $\lim_{x \to \infty}F''(x)=0$, then the lower bound $(2)$ becomes the trivial bound $$ D(a,b,c) \ge \frac{1}{2} (\inf F'')\lambda(1-\lambda)(b-a)^2=0. $$

So, "naive application" of strong convexity does not apply here as is. However, my intuition is that even if $\lim_{x \to \infty}F''(x)=0$, we should somehow encounter "the strong convexity content" which lies between the fixed $r_0$ and $r_1$ so the "convexity gap" $D(a,b,c)$ should be bounded away from zero.

I thought to express $D(a,b,c)$ as some integral of $F''$ over a domain which contains $[r_0,r_1]$ but so far without success.

2

There are 2 best solutions below

12
On BEST ANSWER

If suffices to require that $F$ is strictly convex and differentiable on an interval $I \subset \Bbb R$. (Even the differentiability requirement can be dropped, see the remarks at the end of the answer.)

For $a, b \in I$ with $a < b$ and $c = \lambda a + (1 - \lambda) b$ with $0 \le \lambda \le 1$ we can write $$ D(a, b, c) = \lambda F(a)+(1-\lambda)F(b)-F(c) \\ = \lambda \bigl \{ F(a) - F(c) - (a-c)F'(c) \bigr\} + (1- \lambda) \bigl \{F(b) - F(c) - (b-c)F'(c)\bigr\} \, . $$

This suggests to introduce $$ H(u, v) = F(u) - F(v) - (u-v) F'(v) $$ for $u, v \in I$. $H$ has the following properties:

  1. $H(u, v) > 0$ if $u \ne v$.
  2. $H(u_1, v) > H(u_2, v)$ if $u_1 < u_2 \le v$, i.e. $H(u,v)$ is decreasing in $u$ as long as $u \le v$.
  3. $H(u, v_1) < H(u, v_2)$ if $u \le v_1 < v_2$, i.e. $H(u, v)$ is increasing in $v$ as long as $u \le v$.

Property (1) is a direct consequence of the strict convexity: $F(u)$ is larger than the corresponding value of the tangent line at $x=v$.

For property (2) we assume $u_1 < u_2 \le v$ and compute $$ H(u_1, v) - H(u_2, v) = F(u_1) - F(u_2) - (u_1 - u_2) F'(v) \\ \ge F(u_1) - F(u_2) - (u_1 - u_2) F'(u_2) = H(u_1, u_2) > 0 \, . $$ Here we used that $F'$ is increasing.

For property (3) we assume $u \le v_1 < v_2$ and compute $$ H(u,v_1) - H(u, v_2) = -F(v_1) - (u-v_1)F'(v_1) + F(v_2) + (u-v_2) F'(v_2) \\ \le -F(v_1) - (u-v_1)F'(v_2) + F(v_2) + (u-v_2) F'(v_2) \\ = -H(v_1, v_2) < 0 \, . $$

With these tools, estimating $D(a, b, c)$ from below becomes easy. If $a \le r_0 < r_1 \le c < b$ then $$ D(a, b, c) = \lambda H(a, c) + (1-\lambda)H(b,c) \\ \ge \lambda H(a, c) \ge \lambda H(r_0, r_1) \ge \lambda(1- \lambda) H(r_0, r_1) \\ = m \lambda(1-\lambda) (r_1-r_0)^2 $$ with $m$ defined as $$ m = \frac{H(r_0, r_1)}{(r_1-r_0)^2} = \frac{F(r_0) - F(r_1) - (r_0 - r_1) F'(r_1)}{(r_1-r_0)^2} > 0 \, . $$

Remarks:

  • The assumption that $F$ is only defined on $[0, \infty)$ with values in $[0, \infty)$ was not used in the proof.
  • The differentiability requirement can also be dropped. A convex function has one-sided derivatives in every inner point of the interval. The above proof still works if we replace $F'$ by the right (or left) derivative.
0
On

Alternative solution

Let us prove that the best constant $m$ is $$m = \frac{F(r_0) - F(r_1) - F'(r_1)(r_0 - r_1)}{(r_1 - r_0)^2} > 0.$$ (Note: Actually, it is equal to $\frac{1}{(r_1 - r_0)^2}\int_{r_0}^{r_1} (x- r_0) F''(x) \mathrm{d} x$ which is positive since $F(x)$ is strictly convex.)

First, we rephrase the problem as follows:

Let $F : [0, \infty) \to [0, \infty)$ be a $\mathrm{C}^2$ strictly convex function. Let $0 < r_0 < r_1$ be fixed constants. Does there exist a constant $m > 0$ such that $$\lambda F(a) + (1 - \lambda)F(b) - F(\lambda a + (1 - \lambda)b) \ge m \lambda (1 - \lambda) (r_1 - r_0)^2$$ for any real numbers $a, b, \lambda$ satisfying $$0 < \lambda < 1, \quad 0 \le a < r_0 < r_1 < \lambda a + (1 - \lambda) b\ ?$$

Second, we have \begin{align} &\inf_{0 < \lambda < 1,\ 0 \le a < r_0 < r_1 < \lambda a + (1 - \lambda) b} \frac{\lambda F(a) + (1 - \lambda)F(b) - F(\lambda a + (1 - \lambda)b)}{\lambda (1 - \lambda)}\\ =\ & \inf_{0 < \lambda < 1, \ 0 \le a < r_0} \left(\inf_{b > \frac{r_1 - \lambda a}{1 - \lambda}} \frac{\lambda F(a) + (1 - \lambda)F(b) - F(\lambda a + (1 - \lambda)b)}{\lambda (1 - \lambda)}\right)\\ =\ & \inf_{0 < \lambda < 1, \ 0 \le a < r_0} \frac{\lambda F(a) + (1 - \lambda)F(\frac{r_1 - \lambda a}{1 - \lambda}) - F(r_1)}{\lambda (1 - \lambda)} \tag{1}\\ =\ & \inf_{0 < \lambda < 1} \left(\inf_{0 \le a < r_0} \frac{\lambda F(a) + (1 - \lambda)F(\frac{r_1 - \lambda a}{1 - \lambda}) - F(r_1)}{\lambda (1 - \lambda)} \right)\\ =\ & \inf_{0 < \lambda < 1} \frac{\lambda F(r_0) + (1 - \lambda)F(\frac{r_1 - \lambda r_0}{1 - \lambda}) - F(r_1)}{\lambda (1 - \lambda)} \tag{2}\\ =\ & \inf_{y > r_1} \frac{(y - r_0)(y - r_1)F(r_0) + (r_1 - r_0)(y - r_0)F(y) - (y - r_0)^2F(r_1)}{(r_1 - r_0)(y - r_1)} \tag{3}\\ =\ & \lim_{y \to r_1} \frac{(y - r_0)(y - r_1)F(r_0) + (r_1 - r_0)(y - r_0)F(y) - (y - r_0)^2F(r_1)}{(r_1 - r_0)(y - r_1)} \tag{4}\\ =\ & F(r_0) - F(r_1) - F'(r_1)(r_0 - r_1). \tag{5} \end{align} Explanations:
(1): By letting $f(b) = (1 - \lambda)F(b) - F(\lambda a + (1 - \lambda)b)$, we have $f'(b) = (1 - \lambda)F'(b) - (1 - \lambda) F'(\lambda a + (1 - \lambda)b) \ge 0$ (note: $F'(x)$ is non-decreasing) and thus $f(b)$ is non-decreasing on $[b, \infty)$.
(2): By letting $g(a) = \lambda F(a) + (1 - \lambda)F(\frac{r_1 - \lambda a}{1 - \lambda})$, we have $g'(a) = \lambda F'(a) - \lambda F'(\frac{r_1 - \lambda a}{1 - \lambda}) \le 0$ (note: $F'(x)$ is non-decreasing) and thus $g(a)$ is non-increasing on $[0, r_0)$.
(3): Use the substitution $y = \frac{r_1 - \lambda r_0}{1 - \lambda}$.
(4): Use the following fact (the proof is given at the end):
Fact 1: Let $$g(y) \triangleq \frac{(y - r_0)(y - r_1)F(r_0) + (r_1 - r_0)(y - r_0)F(y) - (y - r_0)^2F(r_1)}{(r_1 - r_0)(y - r_1)}.$$ Then $g'(y) \ge 0$ on $(r_1, \infty)$.
(5) Apply L'Hopital rule.

We are done.

$\phantom{2}$

Proof of Fact 1: We have, for $y > r_1$, \begin{align} (r_1 - r_0)(y - r_1)^2g'(y) &= (y - r_1)^2F(r_0) - (r_1 - r_0)^2F(y)\\ &\quad + (r_1 - r_0)(y - r_0)(y - r_1)F'(y)\\ &\quad + (-2(y - r_0)(y - r_1) + (y - r_0)^2)F(r_1) \\ &= (y - r_1)^2F(r_0) - (r_1 - r_0)^2( F(y) - F(r_1) ) \\ &\quad - (r_1 - r_0)^2F(r_1) + (r_1 - r_0)(y - r_0)(y - r_1)F'(y)\\ &\quad + (-2(y - r_0)(y - r_1) + (y - r_0)^2)F(r_1)\\ &\ge (y - r_1)^2F(r_0) - (r_1 - r_0)^2(y - r_1)F'(y) \\ &\quad - (r_1 - r_0)^2F(r_1) + (r_1 - r_0)(y - r_0)(y - r_1)F'(y)\\ &\quad + (-2(y - r_0)(y - r_1) + (y - r_0)^2)F(r_1)\\ &= (y - r_1)^2F(r_0) - (y - r_1)^2F(r_1) + (r_1 - r_0)(y - r_1)^2F'(y) \\ &\ge (y - r_1)^2F(r_0) - (y - r_1)^2F(r_1) + (r_1 - r_0)(y - r_1)^2F'(r_1)\\ &= (y - r_1)^2[F(r_0) - F(r_1) - F'(r_1)(r_0 - r_1)]\\ &\ge 0 \end{align} where we have used $(y - r_1)F'(y) \ge F(y) - F(r_1)$ and $F(r_0) - F(r_1) - F'(r_1)(r_0 - r_1) \ge 0$ and $F'(y) \ge F'(r_1)$ (Note: $F(x) \ge F(y) + F'(y)(x-y)$ for convex functions; $F'(x)$ is non-decreasing.). We are done.