The Legendre Transform of a differentiable, strictly convex, and coercive function

1k Views Asked by At

I encountered the following unsubstantiated fact in a proof:

$f$ is differentiable, strictly convex, and coercive. Thus we may conclude $$(\nabla f)^{-1} = \nabla f^*$$ where $f^*$ is the Legendre transformation of $f$

Loosely speaking, I see how this might work: if a function is differentiable then Legendre trasformation will return the slope at some point $x \in \text{dom} f$. If $f$ is strictly coercive, then $\nabla f$ is monotonic, in the appropriate sense. So an inverse is defined. Loosely, I think the role of coercivity is to make the map between $x \in \text{dom} f$ and its slope at $x$ bijective. I can't understand how coercivity is causing this though.

My questions:

  1. Can someone provide a heuristic description what coercivity is giving us?
  2. Provide a proof of the fact above (or reference)? Assume $f: \mathbb{R}^d \rightarrow \mathbb{R}$.
2

There are 2 best solutions below

0
On BEST ANSWER

Recall that $f^*(y) = \sup \langle x,y\rangle -f(x)$. Assuming there is a minimizer $x^*$ we have, $f^*(y) = \langle x^*,y\rangle -f(x^*)$. At the same time, $x^*$ satisfies the optimality conditions $0=\nabla_x(\langle x,y\rangle -f(x))|_{x=x^*} = y-\nabla_x f(x^*)\implies y=\nabla_xf(x^*)$. If we take the gradient of $f^*(y)$ w.r.t. $y$ to get $\nabla_y f^*(y) = x^*(y)$. So we know that $\nabla f^*(y) = (\nabla f)^{-1}(x^*(y))$. But $x^*$ might not be unique.

So you want,

$$\arg\sup_x \langle x,y\rangle - f(x)=\{x^*\}\neq\emptyset$$

a singleton for all $y$. Coercivity in $x$ will ensure this since $f$ is also strictly convex (and thus $-f$ is strictly concave).

The injectivity of $\nabla f$ needed to define an inverse is a consequence of strict convexity and coercivity. Since $f$ is strictly convex, $\nabla f$ is strictly monotone (and thus injective). Notice that

$$\arg\sup_x \langle x,y\rangle - f(x) = \arg\min_x \langle x,-y\rangle + f(x)$$

Fix $y$ and let $\phi(x) = \langle x,-y \rangle + f(x)$. The sublevel sets of $\phi$ are bounded by coercivity and closed since $\phi$ is continuous. Taking any sublevel set (e.g. the sublevel set for $\phi(0)$), we have that $\phi$ is strictly convex and differentiable and thus admits a unique minimum on a compact set. Then, for any $y$, $x^*$ exists and is unique so that $\nabla f$ is surjective.

0
On
  1. You don't need coercivity. The result is true, e.g., for the exponential function which is not coercive.

  2. Check Chapter 26 in Rockafellar's Convex Analysis about the Legendre transformation.