Hessian of non-differentiable function


Given a function $f = \max\{f_1,f_2\}$ with $f_1,f_2$ convex and differentiable, I know I can calculate the subgradient of $f$.

Is there also an equivalent of the subgradient for the (sub)Hessian?

I am interested in the Hessian because I am trying to solve a convex optimization problem, and I have heard that most optimizers can perform better if the Hessian is given.


I would say it is fallacious to claim, as you have, that "most optimizers can perform better if the Hessian is given." Only optimizers designed to accept second-order information would benefit from the computation of a Hessian, assuming one were available. And because such optimizers typically assume that the function is differentiable, $f$ would not, in theory, be compatible with true smooth solvers.

That said, because your function is "close" to differentiable in some sense, you could just try feeding it to a standard, smooth, second-order solver and see what happens. If the solver is robust enough, it might work just fine for you. If you do this, just create a "false" gradient and Hessian, for instance like this: $$\nabla f(x) = \begin{cases} \nabla f_1(x) & f_1(x) > f_2(x) \\ \nabla f_2(x) & f_1(x) < f_2(x) \\ \tfrac{1}{2} \nabla f_1(x) + \tfrac{1}{2} \nabla f_2(x) & f_1(x) = f_2(x) \end{cases}$$ $$\nabla^2 f(x) = \begin{cases} \nabla^2 f_1(x) & f_1(x) > f_2(x) \\ \nabla^2 f_2(x) & f_1(x) < f_2(x) \\ \tfrac{1}{2} \nabla^2 f_1(x) + \tfrac{1}{2} \nabla^2 f_2(x) & f_1(x) = f_2(x) \end{cases}$$ It is important for me to emphasize that these are not true gradients and Hessians. It is true that $\nabla f(x) \in \partial f(x)$, and as we will see below, there are analogous notions of a "sub-Hessian". Nevertheless, you may well find that this actually works. You never know.
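For concreteness, here is a minimal sketch of that case analysis in Python. The quadratics `A1`, `b1`, `A2`, `b2` are invented purely for illustration; substitute your own $f_1, f_2$ together with their true gradients and Hessians:

```python
import numpy as np

# Invented convex quadratics standing in for f1 and f2; replace with your
# own functions, gradients, and Hessians.
A1 = np.array([[2.0, 0.0], [0.0, 1.0]]); b1 = np.array([1.0, -1.0])
A2 = np.array([[1.0, 0.0], [0.0, 3.0]]); b2 = np.array([-1.0, 0.5])
f1 = lambda x: x @ A1 @ x + b1 @ x
f2 = lambda x: x @ A2 @ x + b2 @ x
g1 = lambda x: 2 * A1 @ x + b1   # gradient of f1
g2 = lambda x: 2 * A2 @ x + b2   # gradient of f2
H1 = lambda x: 2 * A1            # Hessian of f1 (constant for a quadratic)
H2 = lambda x: 2 * A2            # Hessian of f2

def f(x):
    return max(f1(x), f2(x))

def grad_f(x):
    """'False' gradient per the case analysis above; a valid subgradient."""
    v1, v2 = f1(x), f2(x)
    if v1 > v2:
        return g1(x)
    if v1 < v2:
        return g2(x)
    return 0.5 * (g1(x) + g2(x))   # tie: average the two

def hess_f(x):
    """'False' Hessian chosen by the same case analysis."""
    v1, v2 = f1(x), f2(x)
    if v1 > v2:
        return H1(x)
    if v1 < v2:
        return H2(x)
    return 0.5 * (H1(x) + H2(x))
```

You could then hand `f`, `grad_f`, and `hess_f` to a second-order solver, e.g. SciPy's `minimize(f, x0, jac=grad_f, hess=hess_f, method="trust-ncg")`, and see whether it copes.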

There really are better, theoretically sound ways to handle this. First, consider looking at something called semismooth Newton methods. Google will be your friend here; here is a PDF I cannot vouch for, but it was near the top of the search results. To greatly oversimplify: there is indeed a concept of a "sub-Hessian" analogous to the subgradient, and the structure of a semismooth Newton method is not dissimilar to that of a standard Newton method. Furthermore, in any region where $f_1(x)\neq f_2(x)$, you will take standard Newton steps, albeit with more care about the step size.
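To illustrate the structure only — this is a toy sketch, not a real semismooth Newton implementation (no convergence theory, no careful treatment of the kink set, and the quadratics are again invented) — a damped Newton loop that uses the Hessian of the active branch might look like:

```python
import numpy as np

# Invented convex quadratics standing in for f1, f2; replace with your own.
A1 = np.array([[2.0, 0.0], [0.0, 1.0]]); b1 = np.array([1.0, -1.0])
A2 = np.array([[1.0, 0.0], [0.0, 3.0]]); b2 = np.array([-1.0, 0.5])
f1 = lambda x: x @ A1 @ x + b1 @ x
f2 = lambda x: x @ A2 @ x + b2 @ x

def f(x):
    return max(f1(x), f2(x))

def grad_hess(x):
    # Gradient and Hessian of whichever branch is active (ties go to f1).
    if f1(x) >= f2(x):
        return 2 * A1 @ x + b1, 2 * A1
    return 2 * A2 @ x + b2, 2 * A2

x = np.array([2.0, 2.0])
for _ in range(100):
    g, H = grad_hess(x)
    step = np.linalg.solve(H, g)   # Newton direction for the active branch
    t = 1.0
    # Damp with a backtracking (Armijo) line search on f itself, since a
    # full Newton step for one branch can overshoot across the kink.
    while f(x - t * step) > f(x) - 1e-4 * t * (g @ step) and t > 1e-12:
        t *= 0.5
    if t <= 1e-12:
        break  # no longer a descent direction for f; a real method does more
    x = x - t * step
```

Away from the kink this reduces to ordinary damped Newton on the active $f_i$, which is exactly the behavior described above.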

EDIT: Here is a paper from the Journal of Optimization Theory and Applications that deals directly with the exact problem form you're proposing here (except that it allows more than two terms in the maximum). It looks like this particular minimax form is/was S. Schienberg's dissertation topic. It uses subgradients and sub-Hessians, so I suspect it is related to semismooth methods. But I don't have access to the paper at this point.

Another approach would be to use a constrained optimization engine and solve this problem: $$\begin{array}{ll} \text{minimize} & t \\ \text{subject to} & f_1(x) \leq t \\ & f_2(x) \leq t \end{array}$$ You'll have to convince yourself that this is equivalent to your problem (at the optimum, $t$ is driven down to exactly $\max\{f_1(x),f_2(x)\}$); or you can trust me. Now the objective and constraints are all smooth and differentiable, and any smooth constrained optimization engine can handle it.
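As a sketch of this reformulation with an off-the-shelf engine — again with invented quadratic $f_1, f_2$, and with SciPy's SLSQP as just one possible choice of solver:

```python
import numpy as np
from scipy.optimize import minimize

# Invented convex quadratics once more; replace with your own f1, f2.
A1 = np.array([[2.0, 0.0], [0.0, 1.0]]); b1 = np.array([1.0, -1.0])
A2 = np.array([[1.0, 0.0], [0.0, 3.0]]); b2 = np.array([-1.0, 0.5])
f1 = lambda x: x @ A1 @ x + b1 @ x
f2 = lambda x: x @ A2 @ x + b2 @ x

# Stack the decision variable as z = (x, t); minimize t subject to
# f_i(x) <= t, written as t - f_i(x) >= 0 in SciPy's "ineq" convention.
objective = lambda z: z[-1]
constraints = [
    {"type": "ineq", "fun": lambda z: z[-1] - f1(z[:-1])},
    {"type": "ineq", "fun": lambda z: z[-1] - f2(z[:-1])},
]
z0 = np.zeros(3)
z0[-1] = max(f1(z0[:-1]), f2(z0[:-1]))   # start feasible: t0 = f(x0)
res = minimize(objective, z0, method="SLSQP", constraints=constraints)
x_opt, t_opt = res.x[:-1], res.x[-1]     # at the optimum, t_opt = f(x_opt)
```

Note that the minimax structure reappears in the solution: at least one of the two constraints is active at the optimum, so `t_opt` matches `max(f1(x_opt), f2(x_opt))` up to solver tolerance.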