Why is the following NOT a proof of The Chain Rule?

In Leibniz notation, the chain rule reads $$\frac{dy}{dx} = \frac{dy}{du} \cdot \frac{du}{dx}$$

where $y\left ( u\left ( x \right ) \right )$ is a composite function of $x$.

I understand that the $du$'s don't simply cancel, because $\frac{dy}{du}$ and $\frac{du}{dx}$ are defined as specific limits, so their numerators and denominators are infinitesimals rather than numbers and the quotients cannot be manipulated like ordinary fractions.

But applying the definition of a derivative, we can express the above like so:

$$\lim_{\Delta x\to 0} \frac{\Delta y}{\Delta x} = \lim_{\Delta u\to 0} \frac{\Delta y}{\Delta u} \cdot \lim_{\Delta x\to 0} \frac{\Delta u}{\Delta x}$$

At this point we can't use the Product Law of Limits to combine the two limits on the right, because they are taken with respect to different variables.

But if we consider a plot of $u$ against $x$, doesn't $\Delta u \rightarrow 0$ when $\Delta x \rightarrow 0$? And if so, then whenever $\Delta u \rightarrow 0$ we necessarily have $\Delta x \rightarrow 0$.

Then can't we rewrite the above limit equation as:

$$\lim_{\Delta x\to 0} \frac{\Delta y}{\Delta x} = \lim_{\Delta x\to 0} \frac{\Delta y}{\Delta u} \cdot \lim_{\Delta x\to 0} \frac{\Delta u}{\Delta x}$$

And then can't we use the Product Law of Limits to say:

$$\lim_{\Delta x\to 0} \frac{\Delta y}{\Delta x} = \lim_{\Delta x\to 0} \left ( \frac{\Delta y}{\Delta u} \cdot \frac{\Delta u}{\Delta x} \right )$$

And since $\frac{\Delta y}{\Delta u}$ and $\frac{\Delta u}{\Delta x}$ inside the quantity whose limit is being taken are no longer "quotients" of infinitesimals, the $\Delta u$'s can cancel, leaving us with:

$$\lim_{\Delta x\to 0} \frac{\Delta y}{\Delta x} = \lim_{\Delta x\to 0} \frac{\Delta y}{\Delta x} $$

Which in Leibniz notation looks like:

$$\frac{dy}{dx} = \frac{dy}{dx}$$

Q.E.D. (since we manipulated the right-hand side of the equation into looking the same as the left-hand side).

4

There are 4 best solutions below

5
On BEST ANSWER

If $\Delta u=0$ then you have a $0$ in a denominator.

That's not a problem if it happens only when $|\Delta x|>0.000000000001$ since the limit depends only on what happens when $|\Delta x|$ is less than that. And similarly with any other positive number in place of $0.000000000001$.

But now suppose it happens when $|\Delta x|=0.000000000001$ and again when $|\Delta x| =0.000000000001^2$ and again when $|\Delta x|=0.000000000001^3$ and so on, ad infinitum. Then it's a difficulty to be addressed. And what if $\Delta u=0$ for all $\Delta x$ between $\pm 0.001$? Then your proof clearly won't work.

Hence one writes $$ \frac{\Delta y}{\Delta x} = \left.\begin{cases} \Delta y/\Delta u & \text{if } \Delta u\ne0, \\[6pt] dy/du & \text{if }\Delta u = 0, \end{cases} \right\} \cdot \frac{\Delta u}{\Delta x} $$ and one goes on from there. The $\displaystyle \left\{ \begin{array}{c} \text{factor in} \\ \text{braces} \end{array} \right\}$ approaches $dy/du$ as $\Delta x\to 0$ and the second factor, $\dfrac{\Delta u}{\Delta x}$, approaches $du/dx$.
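The case split above can be checked numerically. The following is only a minimal sketch, not part of the original answer: the function names `brace_factor` and `quotient_via_u` are made up for illustration, and the derivative of the outer function is assumed to be supplied by hand.

```python
import math

def brace_factor(f, f_prime, u, du):
    """The factor in braces: Δy/Δu when Δu != 0, otherwise f'(u)."""
    if du != 0:
        return (f(u + du) - f(u)) / du
    return f_prime(u)

def quotient_via_u(f, f_prime, g, x, dx):
    """Δy/Δx written as (brace factor) * (Δu/Δx); defined even if Δu == 0."""
    du = g(x + dx) - g(x)
    return brace_factor(f, f_prime, g(x), du) * (du / dx)

# Constant inner function: Δu == 0 for every Δx, so Δy/Δu alone would be 0/0.
constant = quotient_via_u(math.sin, math.cos, lambda x: 3.0, 1.0, 1e-3)

# Ordinary case: y = sin(x**2) at x = 1, where dy/dx = 2*cos(1).
ordinary = quotient_via_u(math.sin, math.cos, lambda x: x * x, 1.0, 1e-6)
```

With a constant inner function the plain quotient $\Delta y/\Delta u$ would divide by zero, while the patched product is simply $0$, which agrees with $\Delta y/\Delta x$.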

3
On

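This is not a proof, because you start with the formula you want to prove and derive a true statement from it. But a true statement can be implied by a false one: for example, squaring both sides of the false equation $-1 = 1$ gives the true equation $1 = 1$. You would have to start with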

$$ \frac{dy}{dx}=\frac{dy}{dx} $$

and then prove

$$ \frac{dy}{dx}= \frac{dy}{du}\cdot\frac{du}{dx} $$

and not the other way around.

1
On

The main problem is that you could have $\Delta u = 0$ for some $\Delta x \neq 0$, which makes $\dfrac{\Delta y}{\Delta u}$ undefined.

0
On

The proof given in the question is almost rigorous and correct, but written in reverse (as pointed out by Wazul). Moreover, this is a very standard proof of the chain rule. Contrary to what many students think, this proof is not based on infinitesimals. But we do need to add some details.


Let $u = g(x)$ be differentiable and let $y = f(u)$ also be differentiable. Then $y = f(g(x)) = (f \circ g)(x)$ and the Chain Rule says that $$\frac{dy}{dx} = \frac{dy}{du}\cdot\frac{du}{dx}$$ or $$(f\circ g)'(x) = f'(u)g'(x) = f'(g(x))g'(x)$$ The proof in the question needs to be supplemented with proper definitions of $\Delta u$ and $\Delta y$.

We have $\Delta u = g(x + \Delta x) - g(x)$ and
$$\frac{du}{dx} = g'(x) = \lim_{\Delta x \to 0}\frac{g(x + \Delta x) - g(x)}{\Delta x} = \lim_{\Delta x \to 0}\frac{\Delta u}{\Delta x}$$
Similarly, with $\Delta y = f(u + \Delta u) - f(u)$ we have
$$\frac{dy}{du} = f'(u) = \lim_{\Delta u \to 0}\frac{f(u + \Delta u) - f(u)}{\Delta u} = \lim_{\Delta u \to 0}\frac{\Delta y}{\Delta u}$$
We then have
\begin{align}
\frac{dy}{dx} &= (f\circ g)'(x) = \lim_{\Delta x \to 0}\frac{(f \circ g)(x + \Delta x) - (f \circ g)(x)}{\Delta x}\notag\\
&= \lim_{\Delta x \to 0}\frac{f(g(x + \Delta x)) - f(g(x))}{\Delta x}\notag\\
&= \lim_{\Delta x \to 0}\frac{f(g(x) + \Delta u) - f(g(x))}{\Delta x}\notag\\
&= \lim_{\Delta x \to 0}\frac{f(u + \Delta u) - f(u)}{\Delta x}\notag\\
&= \lim_{\Delta x \to 0}\frac{f(u + \Delta u) - f(u)}{\Delta u}\cdot\frac{\Delta u}{\Delta x}\text{ (assume }\Delta u \neq 0)\notag\\
&= \lim_{\Delta x \to 0}\frac{f(u + \Delta u) - f(u)}{\Delta u}\cdot\frac{g(x + \Delta x) - g(x)}{\Delta x}\notag\\
&= \lim_{\Delta u \to 0}\frac{f(u + \Delta u) - f(u)}{\Delta u}\cdot\lim_{\Delta x \to 0}\frac{g(x + \Delta x) - g(x)}{\Delta x}\notag\\
&= f'(u)g'(x)\notag\\
&= \frac{dy}{du}\cdot\frac{du}{dx}
\end{align}
We have assumed in the above that $\Delta u \neq 0$ for all sufficiently small nonzero $\Delta x$. Also, by continuity of $u = g(x)$ (note that differentiability implies continuity) we have $\Delta u \to 0$ as $\Delta x \to 0$, which is what allows the limit over $\Delta x$ in the first factor to be replaced by a limit over $\Delta u$.


The above argument fails when $\Delta u = g(x + \Delta x) - g(x)$ vanishes for infinitely many values of $\Delta x$ as $\Delta x \to 0$. In this case we have $du/dx = g'(x) = 0$. Why? Because if $g'(x) \neq 0$ then the ratio $\Delta u / \Delta x \neq 0$ for all sufficiently small nonzero values of $\Delta x$, and hence $\Delta u \neq 0$ for all such $\Delta x$.

Hence if $\Delta u = 0$ for infinitely many arbitrarily small values of $\Delta x$ then $du/dx = g'(x) = 0$. We show that in this case $dy/dx = 0$. Clearly, for those values of $\Delta x$ for which $\Delta u = 0$ we also have $\Delta y = f(u + \Delta u) - f(u) = 0$, so that the ratio $\Delta y/\Delta x = 0$. For values of $\Delta x$ where $\Delta u \neq 0$ we know that $\Delta y/\Delta u$ is bounded when $\Delta x$ is small (because the derivative $dy/du$ exists and $\Delta u \to 0$) and the ratio $\Delta u/\Delta x$ can be made arbitrarily small (because its limit is $du/dx = 0$). Hence the overall product $$\frac{\Delta y}{\Delta x} = \frac{\Delta y}{\Delta u}\cdot\frac{\Delta u}{\Delta x}$$ can be made arbitrarily small by choosing $\Delta x$ sufficiently small. In both cases, then, $\Delta y / \Delta x$ is arbitrarily small for all sufficiently small values of $\Delta x$. It follows that $$\frac{dy}{dx} = \lim_{\Delta x \to 0}\frac{\Delta y}{\Delta x} = 0$$ and therefore the chain rule holds true in this case also.
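A standard function with this behaviour (my choice of example, not taken from the answer) is $g(x) = x^2\sin(1/x)$ with $g(0) = 0$: at $x = 0$ the increment $\Delta u = g(\Delta x)$ vanishes at every $\Delta x = 1/(n\pi)$, and $g'(0) = 0$. The sketch below takes $f = \sin$ as the outer function, purely as an assumption for illustration, and checks that $|\Delta y/\Delta x| \le |\Delta x|$ at a few step sizes. Floating-point steps will not land exactly on the zeros of $\Delta u$, so this only illustrates the bound and does not exercise the $\Delta u = 0$ branch.

```python
import math

def g(x):
    # Inner function with g'(0) = 0; mathematically g(1/(n*pi)) = 0 for n = 1, 2, ...
    return x * x * math.sin(1.0 / x) if x != 0 else 0.0

def quotient_at_zero(dx):
    """Δy/Δx at x = 0 for y = sin(u), u = g(x)."""
    return (math.sin(g(dx)) - math.sin(g(0.0))) / dx

# Shrinking step sizes Δx = 10^-1, ..., 10^-7.
steps = [10.0 ** -k for k in range(1, 8)]
quotients = [quotient_at_zero(dx) for dx in steps]

for dx, q in zip(steps, quotients):
    print(f"dx = {dx:.0e}   dy/dx quotient = {q:+.3e}")
```

Since $|\sin t| \le |t|$ and $|g(\Delta x)| \le \Delta x^2$, each printed quotient should be no larger than $\Delta x$ in absolute value, consistent with $dy/dx = 0$.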

Note that most common calculus textbooks omit the case $\Delta u = 0$. Also, the argument is much easier to type and to read if $\Delta x$ is replaced by $h$ and $\Delta u$ by $k$. However, I have tried to stick to the notation used by the OP.