Linear programming and differentiation: why can't we differentiate to find the optimum solution?

I understand that differentiating a linear objective (for a maximization) subject to a linear constraint (such as the problem $p=ax+by$ s.t. $cx+dy \leq m$) won't necessarily give the right answer, and that we instead need to try the corners of the feasible set, but I would like a formal explanation of why we cannot differentiate to find the optimal solution.

2.6k Views. Asked by Bumbble Comm (https://math.techqa.club/user/bumbble-comm/detail). There are 3 answers below.
Linear functions are monotonically increasing or monotonically decreasing: their derivative equals their slope, which is a constant. Excluding trivial cases (for example, a slope equal to zero, where the function is constant) they have no critical points at which a maximum or minimum could be attained; they simply increase or decrease. Therefore, to determine the maximum (or minimum) of a linear function, we only need to check the boundary of the set defined by the given constraints.
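As a quick numerical check of that claim, here is a minimal Python sketch (the particular function $f(x) = 3x + 2$ and the interval $[0, 5]$ are illustrative choices of mine, not from the answer): the difference quotient is the same constant everywhere, so no interior point can beat the better endpoint.

```python
# Sketch: a linear function f(x) = 3x + 2 on the interval [0, 5].
# The derivative is the constant 3 everywhere, so there is no
# interior critical point; the maximum must sit at an endpoint.

def f(x):
    return 3 * x + 2

# The difference quotient is (approximately) the same constant 3
# at every sample point.
h = 1e-6
print(all(abs((f(x + h) - f(x)) / h - 3) < 1e-3 for x in [0.5, 1.7, 4.2]))  # True

# No interior point beats the better endpoint.
interior = [5 * i / 1000 for i in range(1, 1000)]
print(max(f(x) for x in interior) <= max(f(0), f(5)))  # True
```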
Take your problem $$\begin{eqnarray} ax+by &=& p \\ cx+dy &\leq& m \\ x &\geq& 0 \\ y &\geq& 0 \\ \end{eqnarray}$$ where the goal is to maximize $p$ under the specified constraints. (I added $x, y \geq 0$; without these constraints the problem is in general unbounded, so no maximum exists.)
$p$ has the constant gradient $p' = \left(\frac{\partial p}{\partial x}, \frac{\partial p}{\partial y}\right) = (a,b)$, which means that at every point $(x,y)$ you can increase $p$ by increasing $x$ or $y$ (or both, assuming $a, b > 0$). Does that help you find the maximum of $p$ within the region $cx + dy \leq m$? No! It would if the gradient were zero somewhere in the interior of that region, but it isn't. Thus, all you have learned is that the maximum is attained somewhere on the border of the region $cx + dy \leq m$. The same is true for every other linear functional $p$: the gradient of every linear functional is constant (that's a defining property of being linear, after all), so you'll never find zeros of $p'$ within the interior of the region of interest; the maximum always lies on the border.
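Here is a minimal Python sketch of that gradient argument, with coefficients $a=2$, $b=3$, $c=d=1$, $m=10$ that I picked purely for illustration: from any strictly interior feasible point, a small step in the gradient direction $(a, b)$ stays feasible and strictly increases $p$, so no interior point can be the maximum.

```python
# Sketch with illustrative coefficients: maximize p = 2x + 3y
# subject to x + y <= 10, x >= 0, y >= 0.
a, b = 2.0, 3.0          # gradient of p is the constant vector (a, b)
c, d, m = 1.0, 1.0, 10.0

def p(x, y):
    return a * x + b * y

def feasible(x, y):
    return c * x + d * y <= m and x >= 0 and y >= 0

# Pick any strictly interior feasible point and take a tiny step
# along the gradient (a, b): feasibility holds and p strictly increases.
x, y, step = 3.0, 2.0, 1e-3
x2, y2 = x + step * a, y + step * b
print(feasible(x, y), feasible(x2, y2))  # True True
print(p(x2, y2) > p(x, y))               # True: an interior point is never a maximum
```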
So let's try something different. We know that the maximum (if there is any!) is attained somewhere on the line segment described by $cx + dy = m$, $x \geq 0$, $y \geq 0$. Rewrite that (assuming $d \neq 0$) as $y(x) = d^{-1}(m - cx)$, and use it to rewrite $p$ as $$ p = ax + bd^{-1}(m - cx) = (a - bcd^{-1})x + bd^{-1}m \text{.} $$ This describes $p$ on the line segment $cx + dy = m$, $x \geq 0$, $y \geq 0$ as $x$ varies. Now look at $\frac{dp}{dx} = a - bcd^{-1}$. Just as before, the derivative is constant! So not only does $p$ attain its maximum somewhere on that line segment, it attains it at one of the endpoints of the segment. And again, the same happens for every linear programming problem: the objective function always attains its maximum at a corner of the allowed region!
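The restriction to the boundary segment can also be checked symbolically. Here is a sketch assuming the sympy library (my choice, not mentioned in the answer) and the same illustrative coefficients as above:

```python
import sympy as sp

# Illustrative coefficients again: a=2, b=3, c=1, d=1, m=10.
x = sp.symbols('x')
a, b, c, d, m = 2, 3, 1, 1, 10

# On the boundary segment cx + dy = m, solve for y and substitute into p.
y = (m - c * x) / d
p_on_segment = a * x + b * y      # p restricted to the segment
print(sp.expand(p_on_segment))    # (a - b*c/d)*x + b*m/d  ->  30 - x
print(sp.diff(p_on_segment, x))   # constant derivative a - b*c/d = -1
```

The derivative $a - bcd^{-1} = -1$ is a negative constant, so on this segment $p$ is maximized at $x = 0$, i.e. at the corner $(0, m/d) = (0, 10)$.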
Thus, differentiation never helps to find the maxima! But on the other hand, we just learned that we only have to search the corners of the allowed region, not the whole region. Which, incidentally, is exactly what the simplex algorithm does: it walks along the edges from corner to corner until it finds a maximum. In your case, the segment $cx + dy = m$, $x \geq 0$, $y \geq 0$ has only two endpoints, $(0, \frac{m}{d})$ and $(\frac{m}{c}, 0)$, so it suffices to evaluate $p$ at those two points and check which one yields the larger value.
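And here is the corner check itself, cross-checked against an off-the-shelf solver (a sketch assuming SciPy is installed, which the answer does not mention; scipy.optimize.linprog minimizes, so the objective is negated to maximize, and its default bounds already impose $x, y \geq 0$):

```python
from scipy.optimize import linprog

# Same illustrative numbers: maximize p = 2x + 3y s.t. x + y <= 10, x, y >= 0.
a, b, c, d, m = 2.0, 3.0, 1.0, 1.0, 10.0

# Evaluate p at the two endpoints of the segment cx + dy = m.
corners = [(m / c, 0.0), (0.0, m / d)]
best = max(corners, key=lambda xy: a * xy[0] + b * xy[1])
print(best)  # (0.0, 10.0): p = 30 there, versus p = 20 at (10.0, 0.0)

# Cross-check: linprog minimizes its first argument, so negate (a, b).
res = linprog([-a, -b], A_ub=[[c, d]], b_ub=[m])
print(res.x)  # approximately [0., 10.], the same corner
```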
It is the same as in 1D, where you have to check the ends of the interval as well as the points where the derivative is zero. The corners are the "ends of the interval". A linear function cannot have an isolated maximum, so the maximum has to be at one of the corners.