Gaining intuition for a property of linear functions: why is $f((1-\lambda)a+\lambda b) \equiv (1-\lambda)f(a)+\lambda f(b)$?


I am told that if we have some linear function $f$ defined over an interval $[a,b]$, then the fact that $f$ is linear implies that, for all $\lambda$ between 0 & 1 exclusive, the following property holds: $$f((1-\lambda)a+\lambda b) \equiv (1-\lambda)f(a)+\lambda f(b)$$

Why is this the case? How can I see that the two expressions are equivalent to each other? My issue isn't in understanding what's being said here, but rather in understanding why it's true.

For context, I am trying to understand the definition of concave & convex functions, and this property is given as a minor step in the build up towards the definition, with no further elaboration.

EDIT: In the answer below I am told that this property is taken as the definition of a linear function, but the resource I'm using tells me that the property holds because $f$ is a linear function, so I feel quite confused.

Wouldn't it be possible to show that this property is implied by a more immediately intuitive definition? Am I thinking about this in the wrong way? How should I view this property/definition?

Any help in clearing up my confusion would be greatly appreciated.


There are 2 best solutions below

On BEST ANSWER

Equivalently, for all values of $a$ and $b$ we have $$f\left((1−\lambda)a + \lambda b\right) = (1 − \lambda)f(a) + \lambda f(b) \qquad\text{for all } \lambda \in (0, 1)$$

That is, a function is both concave and convex if and only if it is linear (or, more properly, affine), taking the form $f(x) = \alpha + \beta x$ for all $x$, for some constants $\alpha$ and $\beta$.

Using "linear" for "affine", as the text does, the "if and only if" follows by double implication.

  • If $f(x) = \alpha + \beta x$, then $f\left((1−\lambda)a + \lambda b\right) = (1 − \lambda)f(a) + \lambda f(b)$ follows for all $\lambda \in (0, 1)$: $$ \require{cancel} \begin{align} f\left((1−\lambda)a + \lambda b\right) &= \alpha + \beta \left((1−\lambda)a + \lambda b\right) \\ &= \alpha \cdot \left(\color{red}{(1-\lambda)} + \color{blue}{\lambda}\right) + \color{red}{\beta \cdot (1-\lambda) a} + \color{blue}{\beta \cdot \lambda b} \\ &= \color{red}{(1-\lambda)\cdot(\underbrace{\alpha + \beta a}_{\color{black}{=\,f(a)}})} + \color{blue}{\lambda (\underbrace{\alpha + \beta b}_{\color{black}{=\,f(b)}})} \\ &= (1-\lambda) f(a) + \lambda f(b) \end{align} $$

  • If $f\left((1−\lambda)a + \lambda b\right) = (1 − \lambda)f(a) + \lambda f(b)$ for all $\lambda \in (0, 1)$, then $f(x) = \alpha + \beta x$ follows for all $x \in [a,b]$. Let $\lambda = \frac{x-a}{b-a} \iff x = (1-\lambda)a + \lambda b$, then: $$ \begin{align} f(x) = f\left((1−\lambda)a + \lambda b\right) &= (1 − \lambda)f(a) + \lambda f(b) \\ &= \left(1 - \frac{x-a}{b-a}\right) f(a) + \frac{x-a}{b-a} f(b) \\ &= \frac{b-x}{b-a}f(a) + \frac{x-a}{b-a}f(b) \\ &= \underbrace{\frac{bf(a)-af(b)}{b-a}}_{=\,\alpha} + \underbrace{\frac{f(b)-f(a)}{b-a}}_{=\,\beta}\,x \\ &= \alpha + \beta x \end{align} $$
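As a numerical sanity check of the two implications above, here is a minimal Python sketch. The helper names `affine` and `recover_coefficients`, the interval $[1, 4]$, and the coefficients $\alpha = 2$, $\beta = -3$ are illustrative assumptions, not part of the original text. It tests the identity at random values of $\lambda$ and then recovers $\alpha$ and $\beta$ from $f(a)$ and $f(b)$ using the formulas in the second bullet.

```python
import random


def affine(alpha, beta):
    """Return the function f(x) = alpha + beta * x."""
    return lambda x: alpha + beta * x


def recover_coefficients(f, a, b):
    """Recover (alpha, beta) from f(a) and f(b), as in the second bullet."""
    alpha = (b * f(a) - a * f(b)) / (b - a)
    beta = (f(b) - f(a)) / (b - a)
    return alpha, beta


# Hypothetical example values, chosen only for illustration.
a, b = 1.0, 4.0
f = affine(2.0, -3.0)

# First implication: an affine f satisfies the identity for lambda in (0, 1).
random.seed(0)
for _ in range(1000):
    lam = random.random()
    lhs = f((1 - lam) * a + lam * b)
    rhs = (1 - lam) * f(a) + lam * f(b)
    assert abs(lhs - rhs) < 1e-9

# Second implication: the two endpoint values determine alpha and beta.
alpha, beta = recover_coefficients(f, a, b)
print(alpha, beta)
```

A check like this cannot replace the algebra, but it makes it easy to experiment with other endpoints and coefficients.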

The fact that $h_{a,b}$ is linear means that $$h_{a,b}\left((1 − \lambda)a + \lambda b\right) = (1 − \lambda)h_{a,b}(a) + \lambda h_{a,b}(b)$$ for any value of $\lambda$ with $0 \le \lambda \le 1$.

This is precisely the "if" part proved at the previous step, just with $h_{a,b}$ instead of $f$.

On

My personal preference for this, especially since you're about to cover convexity/concavity, is to think of

$$(1-\lambda) a + \lambda b$$

as being $(100\cdot \lambda)\%$ of the way from $a$ to $b$, and likewise for the outputs $f(a)$ and $f(b)$. This is justified because the standard parameterization of the line segment from $a$ to $b$ is given by the above expression with $\lambda \in [0,1]$; the most familiar case is $\lambda = \frac 1 2$, which gives the midpoint. (You can play with this in a demo here, working in two dimensions; the only difference is that, in higher dimensions, we apply the parameterization to each coordinate.)

What linearity means for functions, then, is that when the input is $(100\cdot \lambda)\%$ of the way from $a$ to $b$, the output $f((1-\lambda)a+\lambda b)$ is likewise $(100\cdot\lambda)\%$ of the way from $f(a)$ to $f(b)$.
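To make the "percentage of the way" reading concrete, here is a small Python sketch. The function $f(x) = 3x + 2$, the endpoints $2$ and $10$, and the helper names `point_between` and `fraction_of_way` are assumptions chosen for illustration. For each $\lambda$ it computes the input that is a fraction $\lambda$ of the way from $a$ to $b$, then measures how far the output is from $f(a)$ to $f(b)$.

```python
def point_between(a, b, lam):
    """Return the point that is a fraction lam of the way from a to b."""
    return (1 - lam) * a + lam * b


def fraction_of_way(start, end, value):
    """Return how far value is from start to end, as a fraction."""
    return (value - start) / (end - start)


def f(x):
    """A hypothetical linear (affine) function used only as an example."""
    return 3 * x + 2


a, b = 2.0, 10.0
for lam in (0.0, 0.25, 0.5, 1.0):
    x = point_between(a, b, lam)
    out_fraction = fraction_of_way(f(a), f(b), f(x))
    # For a linear f, the output fraction matches the input fraction lam.
    print(lam, x, out_fraction)
```

Swapping in a non-linear function such as `x ** 2` should make the output fraction drift away from $\lambda$, which is the behavior that convexity and concavity describe.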

You can likewise think of linear functions, under your definition, as being those functions $f : [a,b] \to \mathbb{R}$ such that, whenever $x<y$ and $x,y \in [a,b]$, the line segment from $(x, f(x))$ to $(y, f(y))$ coincides with the graph of $f$ between $x$ and $y$. (For convex functions, that line segment will always lie on or above the graph, and for concave functions it will lie on or below.) Perhaps this demo will prove useful for that; as examples of each type of function:

  • Linear: $f(x)=ax$ for any constant $a$
  • Convex: $f(x)=x^2$, $f(x) = a^x$ for $a>0$
  • Concave: $f(x)=\sqrt x$, $f(x) = \ln(x)$
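The chord picture can be checked numerically for the examples above. The following Python sketch is only a sanity check under stated assumptions: the helper names `chord_gap` and `classify`, the tolerance, and the test interval $[1, 4]$ are all illustrative choices, and it samples a single chord rather than every pair $x < y$, so it cannot prove convexity or concavity.

```python
import math


def chord_gap(f, x, y, lam):
    """Chord height minus graph height at the point lam of the way from x to y."""
    chord = (1 - lam) * f(x) + lam * f(y)
    graph = f((1 - lam) * x + lam * y)
    return chord - graph


def classify(f, x, y, steps=99, tol=1e-9):
    """Compare one chord of f with its graph at evenly spaced lam in (0, 1)."""
    gaps = [chord_gap(f, x, y, k / (steps + 1)) for k in range(1, steps + 1)]
    if all(abs(g) <= tol for g in gaps):
        return "linear"   # chord coincides with the graph
    if all(g >= -tol for g in gaps):
        return "convex"   # chord lies on or above the graph
    if all(g <= tol for g in gaps):
        return "concave"  # chord lies on or below the graph
    return "neither"


# The example functions from the list, tested on the assumed interval [1, 4].
print(classify(lambda x: 2 * x, 1.0, 4.0))
print(classify(lambda x: x ** 2, 1.0, 4.0))
print(classify(lambda x: 2 ** x, 1.0, 4.0))
print(classify(math.sqrt, 1.0, 4.0))
print(classify(math.log, 1.0, 4.0))
```

Because only one chord is sampled, a function that bends both ways can still be misclassified on a short interval; testing several intervals gives a more reliable picture.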

If you remember your Calculus I material, the notion of "convex" here generalizes that of "concave up" functions (where you had $f'' > 0$), and "concave" functions are those that were "concave down" (where you had $f''<0$).