Why is it wrong to derive the chain rule this way?

My book says that the chain rule can be stated as $$\dfrac{dy}{dx} = \dfrac{dy}{dt} \dfrac{dt}{dx}$$

However, the book says that it is incorrect to reason that the chain rule is true because the $dt$'s cancel out. Why is it incorrect? I've heard people say that differentials are not really numbers, but I don't know what to think, because my book had just finished a section where it defined $dy$ and $dx$ as $(y-\Delta y)$ and $(x-\Delta x)$ respectively, and the book even used values such as 0.01 to substitute in place of these differentials.

There are 2 best solutions below

4
On BEST ANSWER

Differentials are indeed not numbers*. Basically, they represent a change in a quantity which has "already gone to zero". Thus they really only carry meaning in a ratio, where you can have a ratio of two quantities which are both going to zero, yet the ratio can have a finite, nonzero limit.

To prove the chain rule rigorously, you should actually consider two distinct cases. First you should assume $f$ is differentiable at $g(x)$, $g$ is differentiable at $x$, and $g'(x) \neq 0$. Then you can do something which actually looks like the "cancellation of differentials". Specifically you can write:

$$\frac{f(g(x+h))-f(g(x))}{h}=\frac{f(g(x+h))-f(g(x))}{g(x+h)-g(x)} \frac{g(x+h)-g(x)}{h}.$$

Now you split into two limits and send $h \to 0$. Because of the additional assumption that $g'(x) \neq 0$, we can actually compute the limit of the first factor, and everything turns out to be OK. (It is not completely automatic, but it is quite doable.)

Second, you should separately prove that if instead $g'(x)=0$, then $(f \circ g)'(x)=0$.
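To see why the case $g'(x)=0$ needs separate treatment, here is a standard illustrative example (the particular $g$ is chosen for this sketch and is not from the book). Take

$$g(x)=\begin{cases} x^2 \sin(1/x), & x \neq 0,\\ 0, & x = 0.\end{cases}$$

Then $g'(0)=\lim_{h\to 0} h\sin(1/h)=0$, and at the points $h_n = \frac{1}{n\pi}$ we get

$$g(0+h_n)-g(0) = \frac{1}{n^2\pi^2}\sin(n\pi) = 0 .$$

So the denominator $g(x+h)-g(x)$ in the factorization vanishes for values of $h$ arbitrarily close to $0$, and the first factor is not even defined there. When $g'(x)\neq 0$ this cannot happen for small $h \neq 0$, because $\frac{g(x+h)-g(x)}{h}$ is then close to the nonzero number $g'(x)$.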

What your book is doing in these problems where differentials are replaced by small finite numbers is approximate. Specifically, part of the point of the derivative is that $\Delta y \approx \frac{dy}{dx} \Delta x$ provided $\Delta x$ is sufficiently small. But this is approximate: we do not have anything like "$dy=\Delta y$". And indeed, we should use "$\approx$" not "$=$", in these problems.
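As a concrete illustration (the function $y=x^2$, the point $x=3$, and the increment $\Delta x = 0.01$ are chosen for this example and are not taken from the book):

$$\Delta y = (3.01)^2 - 3^2 = 9.0601 - 9 = 0.0601, \qquad \frac{dy}{dx}\,\Delta x = 2\cdot 3\cdot 0.01 = 0.06 .$$

The two values differ by $(\Delta x)^2 = 0.0001$, which is small compared with $\Delta x$ but not zero, so "$\approx$" is the honest symbol here.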

* There are ways to properly handle these infinitesimal quantities as separate entities. Collectively these ways are called nonstandard analysis. For various reasons, none of these approaches are popular.

4
On

There is a subtle problem with the given answer. In

$\frac{f(g(x+h))-f(g(x))}{h}=\frac{f(g(x+h))-f(g(x))}{g(x+h)-g(x)} \frac{g(x+h)-g(x)}{h},\ $ the conclusion that

$\frac{f(g(x+h))-f(g(x))}{g(x+h)-g(x)}\rightarrow f'(g(x))$ as $h\rightarrow 0$ is not automatic.

Even if $g'(x)\neq 0$, the proof is not trivial because by definition

$f'(g(x))=\lim _{k\rightarrow 0}\frac{f(g(x)+k)-f(g(x))}{k}$ and this is not the same thing as $\lim _{h\rightarrow 0}\frac{f(g(x+h))-f(g(x))}{g(x+h)-g(x)}$.

To do this properly, it's a little more fussy. See this easy-to-follow proof.
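For reference, one standard way to make the argument rigorous (a sketch, and not necessarily the proof linked above) uses an auxiliary function; the names $a$ and $\varphi$ are introduced here for illustration. Write $a = g(x)$ and define

$$\varphi(k)=\begin{cases}\dfrac{f(a+k)-f(a)}{k}, & k\neq 0,\\[2ex] f'(a), & k=0.\end{cases}$$

Because $f$ is differentiable at $a$, $\varphi$ is continuous at $k=0$. For every $h$, including those with $g(x+h)=g(x)$, we have

$$f(g(x+h))-f(g(x)) = \varphi\big(g(x+h)-g(x)\big)\,\big(g(x+h)-g(x)\big),$$

so for $h \neq 0$

$$\frac{f(g(x+h))-f(g(x))}{h} = \varphi\big(g(x+h)-g(x)\big)\,\frac{g(x+h)-g(x)}{h}.$$

As $h\to 0$, $g(x+h)-g(x)\to 0$ because $g$ is continuous at $x$, so the first factor tends to $\varphi(0)=f'(g(x))$ and the second tends to $g'(x)$. This handles the cases $g'(x)\neq 0$ and $g'(x)=0$ at once, with no division by $g(x+h)-g(x)$.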