Can Adding a Convex Term to a Non-Convex Optimization Problem Make it Convex?

369 Views Asked by At

In Machine Learning optimization problems - a "regularization" term is often added to the optimization problem to reduce overfitting:

enter image description here

I have noticed that in the case of the L2-Norm regularization, this term (i.e. a function) can be considered as a Convex Term as it is basically "quadratic" in nature.

My Question: In the L2-Norm case, the optimization problem without this regularization term is likely a Non-Convex problem - but we then add a Convex Term to this problem. Do we know if doing this (i.e. adding the Convex Term to a Non-Convex Optimization Problem) automatically makes the optimization problem as Convex?

I do not think that this is the case, seeing as:

  • Convex Optimization Problems are generally easier to solve than Non-Convex Optimization Problems

  • Anecdotally, I have heard of Regularized Loss Functions (e.g. for Neural Networks) that are considered to be "very difficult" optimization problems - even though they have this Convex Term. This informally leads me to believe that in the case of L2 Regularization, the fundamental optimization problem remains Non-Convex.

However, "anecdotal and informal logic" is generally never acceptable in understanding mathematics.

Can someone please comment on this?

Thanks!

1

There are 1 best solutions below

4
On

Yes, adding a large enough convex term can make a problem convex. For example, consider the nonconvex function $-x^2$ and the convex function $x^2$. For constant $\lambda \ge 1$, the sum $-x^2 + \lambda x^2=(\lambda-1)x^2$ is convex.

This is also a standard trick in binary quadratic programming, where $x_i$ is a binary decision variable and the objective is to minimize the multivariate quadratic function $$\sum_i \sum_j q_{ij} x_i x_j + \sum_i c_i x_i$$ subject to linear constraints. Let $\lambda$ be the absolute value of the smallest (negative) eigenvalue of $Q=(q_{ij})$. Then adding $\lambda(x_i^2-x_i)$, which is $0$ when $x_i$ is binary, makes the objective function convex.