What do the coefficients $\lambda$ and $1- \lambda$ represent in the convexity condition of $f$?

3k Views Asked by At

I am trying to understand why the formulation $\lambda f(x_1)+(1-\lambda)f(x_2)$ should be greater than $f[\lambda x_1+(1-\lambda)x_2]$ and what does it mean geometrically.

Convexity condition of $f$: $$f[\lambda x_1+(1-\lambda)x_2]\leq\lambda f(x_1)+(1-\lambda)f(x_2)\quad\forall \space 0 < \lambda < 1$$

5

There are 5 best solutions below

5
On BEST ANSWER

The idea is that the value of $f$ at a point between $x_1$ and $x_2$ is less (or equal) than the value at the same point for the line segment between $f(x_1)$ and $f(x_2)$.

enter image description here

(credit Wikipedia)

The expression $\lambda x_1+(1-\lambda)x_2$ is just a parametrization for all the points between $x_1$ and $x_2$ on $x$ axis and $\lambda f(x_1)+(1-\lambda)f(x_2)$ is the corresponding parametrization for the line segment between $f(x_1)$ and $f(x_2)$.

The concept can be generalized for more points by Jensen's inequality.

0
On

The right hand side is a parameterisation of the straight line between $f(x_1)$ and $f(x_2)$. The left hand side is the point on the function with the same $x$-value as the point on the straight line on the right hand side. So this says that the straight line between any two points lies entirely above the function. Equivalently, it says that $\{(x,y)|y \geq f(x)\}$ is a convex set, in the usual sense.

0
On

The intuition is that when a function is "really" convex, for each two points $(x_,f(x))$ and $(y,f(y))$ the corresponding connecting line segment lies above the function between those two points which is a direct intuition of convexity . $0<\lambda<1$ means in fact the interior of the interval between the two points.

2
On

You can see a convex function as "always turning left", so that it cannot meet a straight line more than twice.

Your equation describes the curve and a chord between two points, and expresses that they do not intersect.

0
On

N.B. I have posted a copy of this answer for the question Definition of convexity, which is essentially the same, though the answer given did not cover the background as I do here.

The idea of convexity is is applicable in the first place to shapes or their surfaces and means bulging with no dents. This concept can be applied when the shape is a set of points in a space for which we can define a “dent”; Euclidean spaces will do. It can also apply to part of the surface with no dents.

We can think of a dent as a place where you can draw a straight line segment joining two points in the set but leaving the set somewhere along that segment. If the set is “well-behaved” and has a surface, such a segment leaves the set at some point and re-enters it another, there is a subsegment joining points on the surface. In this case, we may define convex by saying all points on such segments lie in the set.

Derived from that, a function is described as convex when the set of points above (or maybe below) of its graph is convex. Note that a function may be convex upwards or downwards, with the unqualified form meaning “convex downwards”. Further, as in your case, we call a function convex on an interval if the set of points above the graph with $x$ in that interval is convex.

The formulation with $λ$ and $1-λ$ formalises the above definition for the case of a function, that all points on a segment between points on the graph lie in the set: one side gives the value of the function $λ$ of the way along $[x_1,x_2]$, the other, the point that far along the segment joining two points on the line; the inequality says the point on the segment is above the graph, i.e. in the set.