Does strictly convex imply invertible gradient?

If $f:\mathbb R^n \to \mathbb R$ is strictly convex and continuously differentiable, does this imply that $\nabla f$ is a one-to-one mapping? To be precise, can we say that $x, y \in \mathbb R^n$ and $\nabla f(x) = \nabla f(y)$ implies $x = y$?

There are 3 best solutions below

3
On

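No, if "invertible" means a bijection onto $\mathbb R^n$. Try $x \mapsto \mathrm{e}^x$: its derivative is one-to-one, but its range is only $(0, \infty)$.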

1
On

Suppose that there exist $x, y \in \mathbb R^n$ such that $\nabla f(x) = \nabla f(y)$ and $x \neq y$.

Then, by the strict convexity of $f$, we can write \begin{equation} \nabla f(x) \cdot (y-x) < f(y) - f(x) \end{equation} and similarly \begin{equation} \nabla f(y) \cdot (x-y) < f(x) - f(y). \end{equation} Multiplying both sides of the latter inequality by $-1$ and substituting $\nabla f(x)$ in place of $\nabla f(y)$, we obtain \begin{equation} \nabla f(x) \cdot (y - x) > f(y) - f(x), \end{equation} which contradicts the first inequality.

Thus, if $x, y \in \mathbb R^n$ satisfy $\nabla f(x) = \nabla f(y)$, then $x =y$.
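
The two strict inequalities used in this argument can be checked numerically. The following is only a sketch: the sample function $f(x) = \sum_i (e^{x_i} + x_i^2)$ and the helper names `gradient_gap` and `monotonicity_gap` are assumptions chosen for illustration, not part of the proof. Adding the two inequalities gives $(\nabla f(x) - \nabla f(y)) \cdot (x - y) > 0$ for $x \neq y$, which rules out $\nabla f(x) = \nabla f(y)$.

```python
import math
import random


def f(x):
    # Sample strictly convex C^1 function (an assumption for illustration).
    return sum(math.exp(xi) + xi * xi for xi in x)


def grad_f(x):
    # Gradient of the sample function above, computed componentwise.
    return [math.exp(xi) + 2.0 * xi for xi in x]


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def gradient_gap(x, y):
    """f(y) - f(x) - grad f(x).(y - x); strictly positive when x != y."""
    diff = [yi - xi for xi, yi in zip(x, y)]
    return f(y) - f(x) - dot(grad_f(x), diff)


def monotonicity_gap(x, y):
    """(grad f(x) - grad f(y)).(x - y); the sum of the two gradient gaps."""
    gx, gy = grad_f(x), grad_f(y)
    return dot([a - b for a, b in zip(gx, gy)],
               [a - b for a, b in zip(x, y)])


random.seed(0)
pairs = [
    ([random.uniform(-2, 2) for _ in range(3)],
     [random.uniform(-2, 2) for _ in range(3)])
    for _ in range(1000)
]

# Both first-order inequalities, and their sum, should hold for every pair.
all_positive = all(
    gradient_gap(x, y) > 0
    and gradient_gap(y, x) > 0
    and monotonicity_gap(x, y) > 0
    for x, y in pairs
)
print(all_positive)
```

A numerical check like this is no substitute for the proof; it only illustrates which quantities the contradiction compares.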

0
On

This question overlaps substantially with an answer of mine to a question on Physics Stack Exchange concerning the equivalence of the Lagrangian and Hamiltonian formalisms. My reasoning there used slightly different hypotheses.

In that case $f$ was the Lagrangian function with $t, q$ fixed, regarded as a function of $\dot{q}$. I write down my idea here with reference to that setup.

One can conclude a little more than what the question asks by assuming that

  • $f$ is $C^2$,
  • the domain is relaxed: $f: A \to \mathbb{R}$, where $A\subset \mathbb{R}^n$ is open and convex,
  • $f$ is convex.

By saying above that $f$ is a $C^2$ convex function, I mean that the Hessian matrix $$H(x) = \left[\frac{\partial^2 f}{\partial x_i \partial x_j} \right]$$ is positive definite everywhere on its domain $A$. (This is stronger than strict convexity: $x \mapsto x^4$ is strictly convex, but its second derivative vanishes at $0$.)

We can prove that, under the three hypotheses above,

(1) $B:= \nabla f(A) \subset \mathbb{R}^n$ is open;

(2) $$F: A \ni x \mapsto \nabla f(x) \in B$$ is a $C^1$ diffeomorphism.

In fact, as I prove below, $dF$ is nonsingular everywhere in $A$, so the inverse function theorem shows that $F$ is a local $C^1$ diffeomorphism. In particular $F$ is open and thus $B$ is open as well. Since, as I also prove below, $F$ is injective, it is a $C^1$ diffeomorphism from $A$ to $B$.

Proof that $dF$ is nonsingular. If $dF(x_0)$ were singular, then $dF(x_0) X=0$ for some $X\neq 0$ in $\mathbb{R}^n$, that is $H(x_0)X =0$ and thus $X^t H(x_0) X =0$. This contradicts the fact that $H(x)$ is positive definite everywhere.

Proof of injectivity of $F$. If $F(x)=F(y)$ and $x\neq y$, consider the segment $z(t) = x + t (y-x)$ for $t\in [0,1]$, which is entirely contained in $A$ because $A$ is convex, and define the $C^1$ map $$[0,1]\ni t \mapsto (y-x)^t F(z(t)) \in \mathbb{R}.$$ It holds $(y-x)^t F(z(1))= (y-x)^t F(z(0))$, and thus, by Rolle's theorem, $\frac{d}{dt}\left[(y-x)^t F(z(t))\right]\big|_{t=t_0}=0$ for some $t_0 \in (0,1)$. By the chain rule, that is $$(y-x)^t dF(z(t_0))(y-x)=0$$
and thus $$(y-x)^t H(z(t_0))(y-x) = (y-x)^t dF(z(t_0))(y-x)=0,$$ which is impossible as before since $y-x \neq 0$.
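
Since $F$ is a diffeomorphism onto its image, a point $x$ can in principle be recovered from $p = F(x)$. The following is a minimal sketch of that idea, assuming the sample function $f(x) = \sum_i (e^{x_i} + x_i^2)$ on $A = \mathbb{R}^n$. Its Hessian is diagonal with entries $e^{x_i} + 2 > 0$, so a Newton iteration can be run componentwise. The function and the names `F_inverse` and `hessian_diag` are hypothetical choices for illustration, and Newton's method is just one possible way to invert $F$ numerically.

```python
import math


def F(x):
    # Gradient of the sample function f(x) = sum(exp(x_i) + x_i**2).
    return [math.exp(xi) + 2.0 * xi for xi in x]


def hessian_diag(x):
    # Diagonal of the Hessian; every entry is at least 2, so H is positive definite.
    return [math.exp(xi) + 2.0 for xi in x]


def F_inverse(p, tol=1e-12, max_iter=200):
    """Solve F(x) = p by Newton's method, starting from the origin."""
    x = [0.0] * len(p)
    for _ in range(max_iter):
        residual = [Fi - pi for Fi, pi in zip(F(x), p)]
        if max(abs(r) for r in residual) < tol:
            break
        # Newton step: x <- x - H(x)^{-1} (F(x) - p), with H diagonal here.
        x = [xi - ri / hi for xi, ri, hi in zip(x, residual, hessian_diag(x))]
    return x


x_true = [0.3, -1.2, 2.0]
x_recovered = F_inverse(F(x_true))
recovery_error = max(abs(a - b) for a, b in zip(x_true, x_recovered))
print(recovery_error)
```

For a non-separable $f$ the Newton step would require solving a linear system with the full Hessian $H(x)$; positive definiteness guarantees that this system has a unique solution.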