Intuitive explanation of why a chord of a convex function has to be a straight line


I was trying to understand the definition of convexity better.

A simple definition of convexity is:

$$f(tx_1 + (1 -t)x_2) \leq tf(x_1) + (1-t)f(x_2)$$

$\forall x_1,x_2 \in Domain(f)$ and $\forall t \in [0,1]$

Intuitively, one way to think about this definition is that if we pick any two points on the graph of a convex function and draw a straight line between them, then the portion of the graph between these two points will lie on or below this straight line.

The height of this (straight) line that connects the two points, at the position $x = tx_1 + (1-t)x_2$, is given by:

$$y = tf(x_1) + (1-t)f(x_2)$$

However, for me, it was not 100% intuitive why this line has to be straight and linear. Why does it have to be of the form $y =mx + c$?

It is clear that any weighted average (by $t$ and $(1-t)$) of $f(x_1)$ and $f(x_2)$ has to lie between $f(x_1)$ and $f(x_2)$, because it's a weighted average. However, it is not clear that it has to trace out a straight line.

The one way that I tried to convince myself of this was by verifying that the gradient of the line defined by $y = tf(x_1) + (1-t)f(x_2)$ was indeed $\frac{f(x_2) - f(x_1)}{x_2 - x_1}$. So I grabbed the beginning point and the end and tried to calculate the slope:

$$\frac{y_{end} - y_{start}}{x_{end} - x_{start}} = \frac{tf(x_1) + (1-t)f(x_2) - f(x_1)}{tx_1 + (1 -t)x_2 - x_1} = \frac{(1-t)f(x_2) - (1-t)f(x_1)}{(1-t)x_2 - (1-t)x_1} = \frac{f(x_2) - f(x_1)}{x_2 - x_1}$$

Which verifies what I wanted.
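The slope computation above can also be checked numerically. The following is only a minimal sketch: the helper names `chord_point` and `slope_from_start` and the example function $f(x) = x^2$ are assumptions made for illustration, not part of the original question. It computes the slope from the start point $(x_1, f(x_1))$ to the chord point for several values of $t$ and compares it with $\frac{f(x_2) - f(x_1)}{x_2 - x_1}$.

```python
def chord_point(f, x1, x2, t):
    """Return (x, y) on the chord between (x1, f(x1)) and (x2, f(x2)) for weight t."""
    x = t * x1 + (1 - t) * x2
    y = t * f(x1) + (1 - t) * f(x2)
    return x, y


def slope_from_start(f, x1, x2, t):
    """Slope of the segment from (x1, f(x1)) to the chord point at weight t.

    Requires t != 1, because t = 1 gives the start point itself.
    """
    x, y = chord_point(f, x1, x2, t)
    return (y - f(x1)) / (x - x1)


def f(x):
    # Hypothetical example of a convex function, chosen only for illustration.
    return x ** 2


x1, x2 = 1.0, 3.0
chord_slope = (f(x2) - f(x1)) / (x2 - x1)

for t in (0.0, 0.25, 0.5, 0.75):
    s = slope_from_start(f, x1, x2, t)
    print(f"t = {t}: slope = {s}, chord slope = {chord_slope}")
```

If the slope is the same for every $t$, all of the points lie on one straight line through $(x_1, f(x_1))$, which is what the algebra shows.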

However, I had to go through a lot of easy, but annoying, algebra to do this. The statement of convexity is always presented to me as something obvious or simple. Since for me it's not obvious that $tf(x_1) + (1-t)f(x_2)$ has to describe a straight line, I feel that either I am not thinking about the statement the right way or I just don't understand it well enough.

Does someone have a clean/simple/intuitive way to explain why that equation describes the desired chord that lies above the graph?

Best answer

Using some algebra you can simplify:

\begin{eqnarray} y &=& tf(x_1) + (1-t)f(x_2) \nonumber \\ &=&t(f(x_1)-f(x_2)) + f(x_2) \end{eqnarray}

so you have the form $y = at + b$ where $a=f(x_1)-f(x_2)$ and $b=f(x_2)$ since $f(x_1)$ and $f(x_2)$ are constants.
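To connect this back to the form $y = mx + c$ asked about in the question, one possible extra step (a sketch that is not part of the original answer, and that assumes $x_1 \neq x_2$) is to eliminate $t$ using $x = tx_1 + (1-t)x_2 = t(x_1 - x_2) + x_2$:

$$t = \frac{x - x_2}{x_1 - x_2} \quad\Longrightarrow\quad y = \frac{f(x_1)-f(x_2)}{x_1 - x_2}\,(x - x_2) + f(x_2)$$

Under that assumption, this is $y = mx + c$ with $m = \frac{f(x_1)-f(x_2)}{x_1 - x_2}$ and $c = f(x_2) - m x_2$. Both $x$ and $y$ are affine functions of the same parameter $t$, which is why the points $(x, y)$ trace out a straight line.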