Linking the "Formal Definition of Convexity" to the "Informal Definition of Convexity"

227 Views Asked by At

In mathematics, we told that a Convex Function is a function that only has one minimum (or maximum) point (i.e. a Convex Function can have no local minimas - only a single global minimum). For example, in Machine Learning applications, we are told that functions (e.g. Loss Functions) that are Non-Convex are difficult to optimize because they have many local minima points (where the optimizer can get "stuck") - but on the other hand, Convex functions are easier to optimize because they tend to have only a single minimum point (i.e. no local minimas) and are thus easier to optimize. Thus, Non-Convexity appears to be one of the most important factors in deciding whether or not an optimization problem is expected to be difficult or easy.

Informally, we are taught to verify if simple functions (e.g. 2 dimensional) are Convex by the "Secant Test" - we draw a straight line between any two points on a function and check to see if this line only intersects the function at these two points. If this is the case, we conclude that this function in question is Convex.

Mathematically, when we look at the formal definition of Convexity (https://en.wikipedia.org/wiki/Convex_function):

Let $X$ be a convex subset of a real vector space and let $f:X\to \mathbb {R}$ be a function.

Then $f$ is called convex if and only if any of the following equivalent conditions hold:

For all $0\leq t\leq 1$ and all $x_{1},x_{2}\in X$: $f\left(tx_{1}+(1-t)x_{2}\right)\leq tf\left(x_{1}\right)+(1-t)f\left(x_{2}\right)$

Based on this formal definition of Convexity - I had the following question:

  • From this "formal definition" of Convexity, how do we recognize the "informal definition" of Convexity? Just by looking at this mathematical inequality - how exactly is this inequality indicating that a Convex Function always has only one minimum point?

Thank you!

2

There are 2 best solutions below

3
On

A convex function, even informally, is not a function that has one and only one minimum. The exponential function is convex, but it has no minimum at all. Constant functions have infinitely many minima and are convex. And the function $f:\mathbb R\to\mathbb R, f(x)=x(x-1)^3$ (see here) has only one minimum, but is not convex.

The essence of a convex function, even informally, is basically exactly what the formal definition describes: A function is convex if for any two points $A$ and $B$ on the graph, none of the points on the graph between $A$ and $B$ are above the secant through $A$ and $B$. That's what convexity is. And the formal definition encompasses exactly that: $(t,tf(x_1)+(1-t)f(x_2))$ parametrizes the secant between the points $A(x_1,f(x_1))$ and $B(x_2,f(x_2))$, while $(t,f(tx_1+(1-t)x_2))$ parametrizes the graph of $f$. The formal definition just says that the graph of $f$ is not above the secant (at least not between the points $A$ and $B$).

2
On

I think that there is no guarantee of existence of minima. The only thing that convexity says is that LOCAL minima are also GLOBAL minima! That is all! Which is easy to prove. Let $S(x_{0},\epsilon )$ be a neighborhood of $x_{0}$ such that $f(x)\geq f(x_{0})$ for all ponts in S. i.e. there is a local minimum at $x_{0}$. Let y be any point of the domain of f. Then $x_{0}+\lambda (y-x_{0})\in S$ for λ sufficiently small. That implies $f(x_{0})\leq f(x_{0}+\lambda (y-x_{0}))=f((1-\lambda )x_{0}+\lambda y)\leq (1-\lambda )f(x_{0})+\lambda f(y)$ which gives $f(x_{0})\leq f(y)$ i.e. $f(x_{0})$ is a GLOBAL minimum!