Idea behind the definition of convex functions

460 Views Asked by At

Recently, I had to work on a lot of problems concerning convex functions (e.g. showing they are continuous on an open interval ).

This definition is always given:

A function $f:X \rightarrow R$ $$\forall x_1, x_2 \in X, \forall t \in [0, 1]: \qquad f(tx_1+(1-t)x_2)\leq t f(x_1)+(1-t)f(x_2)$$

And I can work on it, but I have no idea where it comes from. I know what a convex graph looks like, but how did they come up with that formula? What is the motivation behind that $\forall t \in [0, 1]$ that pops up everywhere when talking about convex functions?

On Wikipedia there even is an "illustration", but I can't make sense of it either. https://en.wikipedia.org/wiki/Convex_function#/media/File:ConvexFunction.svg

Could somebody please explain, how somebody looked at a convex graph one time and came up with this definition for a convex function?

1

There are 1 best solutions below

1
On BEST ANSWER

To answer the first question. Where does $\forall t\in [0,1]$ come from and what does it mean:

Basically it is a means to indicate a "path" between two states where $f(0)$ is the first state and $f(1)$ is the end state and $f(t)$ is the state and any "time" in between.

The simplest example is a line segment between two points $a, b$ in $\mathbb R^n$. Let $f(t) = a + t(b-a) = a(1-t) + t*b$. Note: $f(0) = a$ the point at "time" 0. And $f(1) = b$ the point at "time" 1. And $f(t)$ is the position at proportion $t$ of the line. (e.g. midpoint of the line is $f(.5) = \frac {a+b}2$; quarter point is $f(.25) = \frac {3a + b}4$; etc.)

Here's a thought experiment: Suppose we had a curve. The graph of $y =f(x) = x^2 + 3x + 5$. And suppose we had another curve. The graph of $y =g(x)= x^3-27$. And we wanted to "bend" one curve into the other and we wanted to describe the process of bending the curve so we can describe all the bent curves in between. How would we do that?

Well we use the idea of "paths". At time $0$ we will have $y(0) = f(x) =x^2 + 3x + 5$ and at time $1$ we will have $y(0) = g(x) = x^3 - 27$. And at time $t$ we have $y(t) =f(x)(1-t) + g(x) t = (x^2 + 3x + 5)(1-t) + t(x^3 - 27)$. We think of that as a continuous "path" function $H:[0,1]\times \mathbb R \rightarrow \mathbb R$ the notation of which confused the HECK out of me when I first encountered it. But all it means is we have a bunch of functions $H(t,x) =y_t(x) = whatever$ and this is classifying them all so we can access each one by index $t$ where each $y_t(x)$ represents some function "in between" $f(x)$ and $g(x)$.

Okay... I digress.

But this is useful.

Okay... what does "convex" mean? It means "bowed". What does that mean. It means that if you picked any two points on the graph and drew a line between them, all the points on the convex curve will be to one side (right) of the line. Or you can never draw a from any point of the curve to any other point of the curve and have to curve intersect.

How to define that mathematically?

Well, you pick any two points of the curve. $(x_1, f(x_1))$ and $(x_2, f(x_2))$ and draw a line between them. Meanwhile we also have the curve between those two points.

Suppose there were an ant crawling along the curve traversing the horizontal direction at a constant speed, while at the same time we have a spider crawling along the line also traversing the horizontal direction at the exact speed. i.e. The spider and the fly will always be vertically directly above or below or at the same spot as each other.

As the curve is convex the spider, on the line, will always be above the ant, on the curve.

The spider, traveling in a line will at time $t$ be at $(x_1(1-t) + x_2*t, f(x_1)(1-t) + f(x_2)t)$.

The ant traveling along the curve will at time $t$ be ate $(x_1(1-t) + x_2*t, f(x_1(1-t) + x_2*t))$.

As the spider is above the ant:

$f(x_1)(1-t) + f(x_2)t \ge f(x_1(1-t) + x_2*t)$