Suppose $f\in C([a,b])$ is twice continuously differentiable and $f''(x)>0$ on the interval. Show that the best linear approximation $p$ to $f$ has the slope $p'(x)=(f(b)-f(a))/(b-a)$.
To my understanding, since the second derivative of $f$ is greater than $0$ on the entire interval, $f$ must be concave up (convex) on the entire interval. Convexity tells us that the maximum of $f$ on $[a,b]$ is attained at one of the endpoints, and that the minimum is attained somewhere else in the interval.
Now, by Taylor's theorem, for any $c\in[a,b]$ there is a degree-one Taylor polynomial $p(x)=f(c)+f'(c)(x-c)$ centered at $c$. Since $p$ is linear, it must have the form $p(x)=mx+q$ where $m,q\in\mathbb{R}$.
But now I am stuck. Looking at $p'(x)=(f(b)-f(a))/(b-a)$, the right side resembles the difference quotient that defines the derivative of $f$, and intuitively it makes sense to me that the $p$ approximating $f$ should have this slope, but I cannot seem to prove it. How should I proceed?
First we consider the case $f(a) = f(b) = 0$. Since $f''>0$, we have $f(x) < 0$ for all $x\in (a,b)$, and $f$ attains its minimum at some $c\in (a, b)$.
Then it is easy to check that the constant function $L(x) = \frac 12 f(c)$ is the best linear approximation, with $\|f - L\|_\infty = \frac 12 |f(c)|$ (you need only $f(a) = f(b) = 0$ and $f(c)\le f(x)\le 0$ to show this).
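For completeness, the "easy to check" step can be spelled out with the standard alternation argument (my own filling-in, using only the facts stated above):

```latex
Write $e(x) := f(x) - L(x) = f(x) - \tfrac12 f(c)$. Since $f(c)\le f(x)\le 0$,
\[
  \tfrac12 f(c) \;\le\; e(x) \;\le\; -\tfrac12 f(c),
\]
so $\|e\|_\infty = \tfrac12|f(c)|$, attained with alternating signs:
\[
  e(a) = e(b) = -\tfrac12 f(c) > 0, \qquad e(c) = \tfrac12 f(c) < 0.
\]
If some linear $L_1$ had $\|f - L_1\|_\infty < \tfrac12|f(c)|$, then
\[
  d := L_1 - L = (f - L) - (f - L_1)
\]
would satisfy $d(a) > 0$, $d(c) < 0$, $d(b) > 0$. But a polynomial of
degree at most one cannot change sign twice, a contradiction.
```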
In the general case, consider
$$ g(x) = f(x) - \frac{f(b)-f(a)}{b-a}(x-a) - f(a).$$
Then $g'' = f'' >0$ and $g(a) = g(b) = 0$. Thus $g$ is best approximated by a constant function. Since approximating $f$ by a linear function is the same as approximating $g$ by that function minus the secant line (which is again linear), $f$ is best approximated by a linear function with slope $$\frac{f(b) - f(a)}{b-a}.$$
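A quick numerical sanity check of the result (my own sketch, not part of the proof): for a convex $f$, the slope minimizing the sup-norm error of a linear fit on $[a,b]$ should come out equal to the secant slope $(f(b)-f(a))/(b-a)$. Here I pick $f=\exp$ on $[0,1]$ as a hypothetical example; any $f$ with $f''>0$ would do.

```python
import math

# Check: the minimax (sup-norm) linear fit to a convex f on [a, b]
# should have slope (f(b) - f(a)) / (b - a).
a, b = 0.0, 1.0
f = math.exp
xs = [a + (b - a) * i / 2000 for i in range(2001)]

def minimax_error(m):
    # For a fixed slope m, the best intercept centers the residual
    # f(x) - m*x, so the achievable sup-norm error is half the
    # residual's spread over [a, b].
    r = [f(x) - m * x for x in xs]
    return (max(r) - min(r)) / 2

# Brute-force search over candidate slopes (step 1e-3); minimax_error
# is convex in m, so a grid search suffices for a sanity check.
slopes = [1.0 + i / 1000 for i in range(1501)]
best_m = min(slopes, key=minimax_error)

secant = (f(b) - f(a)) / (b - a)  # e - 1 ≈ 1.71828
print(best_m, secant)
```

The two printed values should agree up to the grid resolution, matching the theorem.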