Geometric intuition of conjugate function

16.9k Views Asked by At

I am looking for a geometric and intuitive explanation of the conjugate function and how it maps to the below analytical formula.

$$ f^*(y)= \sup_{x \in \operatorname{dom} f } (y^Tx-f(x))$$

2

There are 2 best solutions below

2
On BEST ANSWER

I found Bertsekas' explanations quite simple and useful to understand many different things in convex analysis and optimization. You may want to check out his book "Convex Optimization Theory", or his notes for the MIT course, which also cover conjugacy.

The short explanation on page 7 of the notes is as follows:

Dual description of convex functions

  • Define a closed convex function by its epigraph.
  • Describe the epigraph by hyperplanes.
  • Associate hyperplanes with crossing points (the conjugate function).

enter image description here

Primal description: Values $f(x)$. Dual description: Crossing points $f^*(y)$.

0
On

Some intuition might come from thinking in terms of support functions. Given a set $A\subseteq \mathbb{R}^n$, its support function is given by $$h_A(x)\equiv \sup\{x\cdot a: \,\,a\in A\}.$$ The idea is simple: the support function attempts to describe a set by the set of its support hyperplanes. For each direction $x\in\mathbb{R}^n$, we project the set $A$ onto $x$, and look for the largest value of the projection. This will correspond to an hyperplane that has normal vector $x$, and is tangent to $A$. More precisely, $h_A(x)$ tells you that said tangent hyperplane is $$\{y\in\mathbb{R}^n : \,\, x\cdot y=h_A(x)\}.$$ If $A$ is convex, then it is completely characterised by the set of its support hyperplanes, and thus there is a bijection between $A$ and $h_A$.

Consider now a function $f:\mathbb{R}^n\to \tilde{\mathbb{R}}$, and its epigraph: $$\operatorname{epi}(f)\equiv \{(x,\alpha)\in\mathbb{R}^{n+1}: \,\, f(x)\le \alpha\}\subseteq \mathbb{R}^{n+1}.$$ Geometrically, this is the set of point that sit above (or on) the graph of $f$. A function $f$ is convex (as a function) iff its epigraph is (as a set).

Because we now have a convex set to deal with, the epigraph of $f$, a natural question is how can we characterise it via its support hyperplanes, i.e., what's its support function. We have $$h_{\operatorname{epi}(f)}(\mathbf x^*,y^*) = \sup_{\mathbf x\in\mathbb{R}^n, \,\, \alpha\ge f(\mathbf x)} \left(\mathbf x^*\cdot\mathbf x+y^* \alpha\right).$$ Note that I'm strictly following the definition of $h_A(x)$ I gave above.

But we can now observe something else: for the characterisation in terms of hyperplanes, we don't need to look at $h_A(\mathbf x^*)$ for all $\mathbf x^*\in\mathbb{R}^n$. Rather, it's enough to look at all directions. This can mean all $\mathbf x^*$ such that $\|\mathbf x^*\|=1$. But it can also mean, if we have a subset of $\mathbb{R}^{n+1}$ like we do here, that it's enough to only look at "direction vectors" of the form $(\mathbf x,\pm1)$ for all $\mathbf x\in\mathbb{R}^n$. But also, $h_{\operatorname{epi}(f)}(\mathbf x^*,1)$ is bound to always be infinite for convex $f$ (think of what the tangent hyperplanes of such functions look like). It is therefore sufficient to look at $h_{\operatorname{epi}(f)}(\mathbf x^*,-1)$. If we use the definition above, this is $$h_{\operatorname{epi}(f)}(\mathbf x^*,-1) \equiv \sup_{\mathbf x\in\mathbb{R}^n} \left(\mathbf x^*\cdot\mathbf x-f(\mathbf x)\right).$$ Note that we could remove the $\alpha\ge f(\mathbf x)$ part of the sup because maximising with respect to such $\alpha$ is trivial: we need the $\alpha$ that maximises $-\alpha$ and is such that $\alpha\ge f(\mathbf x)$, and this is clearly just $-f(\mathbf x)$.

We just found precisely the convex conjugate of $f$. We conclude that the convex conjugate of $f$ can be given the precise geometric interpretation of the function characterising the set of tangent hyperplanes to the (epi)graph of $f$.

See also Please explain the intuition behind the dual problem in optimization., and Physical interpretation and notions about conjugate function?.