Can we derive a bound for the subgradient of a stochastic convex function from the bound for the subgradient of its sample function?


Suppose $f:\mathbb R^n\times\Omega\to\mathbb R$, where $\Omega$ is a measurable space, is such that for each $\xi\in\Omega$ the function $f(\cdot,\xi)$ is convex on $\mathbb R^n$. Assume that $\|\partial f(x,\xi)\|\leq M$ for every $(x,\xi)\in\mathbb R^n\times\Omega$, where $\partial f(x,\xi)$ denotes the subdifferential of $f(\cdot,\xi)$ at $x$ (with $\xi$ fixed), and the norm of a set is the supremum of the norms of its elements. It is not hard to prove that $F(x):=E_{\xi}[f(x,\xi)]$ is convex, though not necessarily real-valued. We assume here that $F$ is real-valued, so $\partial F(x)$ is nonempty everywhere.

My question is whether $\|\partial F(x)\|\leq M$ holds for every $x\in\mathbb R^n$. This is claimed in some ML papers and textbooks, but I haven't seen a rigorous proof, so I am not convinced by this result. Can anyone give one? Additional assumptions can be added if necessary.
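For concreteness, here is a one-dimensional instance of the setup (my own example, not part of the claim): take $\xi\sim\mathrm{Uniform}[0,1]$ and $f(x,\xi)=|x-\xi|$, so every subgradient lies in $[-1,1]$ and $M=1$. Then $F$ can be computed in closed form and its derivative indeed stays in $[-M,M]$:

```latex
% Hypothetical example: n = 1, \xi \sim \mathrm{Uniform}[0,1],
% f(x,\xi) = |x - \xi|, so \partial f(x,\xi) \subseteq [-1,1] and M = 1.
\[
F(x) = \mathbb{E}_\xi\,|x - \xi|
     = \int_0^1 |x - t|\,\mathrm{d}t
     = \begin{cases}
         \tfrac12 - x,       & x \le 0,\\[2pt]
         x^2 - x + \tfrac12, & 0 \le x \le 1,\\[2pt]
         x - \tfrac12,       & x \ge 1,
       \end{cases}
\qquad
F'(x) \in [-1, 1] = [-M, M].
\]
```

This example only illustrates the claim; the question is about the general case.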

On BEST ANSWER

This is true.

The key point to notice is that, for a convex function, the condition $\|\partial f(x,\xi)\|\leq M$ for all $x$ is equivalent to $f(\cdot,\xi)$ being globally $M$-Lipschitz. Lipschitz continuity survives taking expectations: for all $x,y\in\mathbb R^n$,
$$|F(x)-F(y)| \;\le\; E_{\xi}\big[|f(x,\xi)-f(y,\xi)|\big] \;\le\; M\|x-y\|.$$
So $F$ is convex and globally $M$-Lipschitz, and applying the same equivalence in the other direction gives $\|\partial F(x)\|\leq M$ for every $x$.
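As a numerical sanity check (not a proof), the example $f(x,\xi)=|x-\xi|$ with $\xi\sim\mathrm{Uniform}[0,1]$ and $M=1$ can be tested in Python: since each sample function is exactly $1$-Lipschitz, the Monte Carlo estimate of $F$ is too, so all difference quotients stay below $M$.

```python
import random

# Sanity check (not a proof): f(x, xi) = |x - xi| is convex in x with
# subgradients in [-1, 1], so M = 1.  F(x) = E|x - xi| should then be
# 1-Lipschitz, hence all of its difference quotients are bounded by M.

random.seed(0)
xis = [random.uniform(0.0, 1.0) for _ in range(100_000)]  # samples of xi

def F(x):
    """Monte Carlo estimate of F(x) = E|x - xi|."""
    return sum(abs(x - xi) for xi in xis) / len(xis)

def diff_quotient(x, y):
    """|F(x) - F(y)| / |x - y|: a lower bound on the Lipschitz constant of F."""
    return abs(F(x) - F(y)) / abs(x - y)

M = 1.0
pairs = [(-1.0, 0.2), (0.5, 2.0), (0.4, 0.6)]
quotients = [diff_quotient(x, y) for x, y in pairs]
# Each sample function is exactly 1-Lipschitz, so the empirical average is too.
assert all(q <= M + 1e-12 for q in quotients)
```

Because each summand $|x-\xi_i|$ is exactly $1$-Lipschitz in $x$, the assertion holds deterministically, not just with high probability.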

Hope this helps.