Chain rule for subdifferential

3.5k Views Asked by At

I have two functions $g(x)$ and $h(z)$ where $h:\mathbb R^n\to \mathbb R$ and $g: \mathbb R\to \mathbb R$. Both are convex, neither are smooth. How can I apply the chain rule to find $\partial (g\circ h)$? (in terms of $\partial g$ and $\partial h$).

What I have so far: the definition for subdifferential (that's all) $$ \partial(g\circ h)(x) = \{ z : g(h(y)) \geq g(h(x)) + z^T(y-x), \forall y\} $$

If in fact $h$ was smooth, we do have this result: $$ \partial(g\circ h)(x) = \nabla h(x)^T \partial g(h(x)) $$

My guess is that the answer will be something like $\partial(g\circ h)(x)=S$ where $$ S = \{b \cdot a: a\in \partial h(x), b\in \partial g(h(x))\} $$ though I'm not sure...

Also does the problem become easier if I restrict the domain of $g$ to nonnegative scalars and claim that $g$ is monotonic?

Edit: Ok one of the special cases I am thinking of is that $h$ is a norm, and is nonsmooth only at 0, with $h(0)=0$. So, we only need to consider $\partial(g\circ h)(0)$; the rest follows the chain rule for smooth $h$. If, in addition, $g$ is monotonic, then $$ a\in \partial h(0) \iff a^Ty \leq h(y) \forall y. $$ $$ b\in \partial g(0) \Rightarrow b(a^Ty) \leq g(a^Ty) \leq g(h(y)) $$ which gives $S \subseteq \partial (g\circ h)$.

The general case is still open!

1

There are 1 best solutions below

0
On BEST ANSWER

Check out Corollary 16.72 in the book by Bauschke and Combettes (second edition), which states:

Let $f\colon H\to\mathbb{R}$ be continuous and convex, and let $\phi$ be lower semicontinuous, convex, and increasing on the range of $f$. Suppose that (the relative interior of the range of $f$ + the positive reals) intersected with the relative interior of the domain of $\phi$ is nonempty. Let $\bar{x}$ be in $H$ such that $f(\bar{x})$ is in the domain of $\phi$. Then $$ \partial (\phi\circ f)(\bar{x}) = \left\{ \alpha u \mid (\alpha,u)\in\partial\phi(f(\bar{x}))\times\partial f(\bar{x})\right\}.$$

So your conjecture is true, with some assumptions. The proof is nontrivial and makes use of coderivatives.