Chain rule proof - Apostol


Apostol's Calculus I, pages 174-175, gives the following proof of the chain rule.

The theorem states: Let $f$ be the composition of two functions $u$ and $v$, say $f=u \circ v$. Suppose that both derivatives $v'(x)$ and $u'(y)$ exist, where $y=v(x)$. Then the derivative $f'(x)$ also exists and is given by the formula $f'(x)=u'(y)\cdot v'(x)$.

Proof: The difference quotient for f is (4.12): $\frac{f(x+h)-f(x)}{h}=\frac{u[v(x+h)]-u[v(x)]}{h}$ . Let $y=v(x)$ and let $k=v(x+h)-v(x)$. Then we have $v(x+h)=y+k$ and (4.12) becomes (4.13): $\frac{f(x+h)-f(x)}{h}=\frac{u(y+k)-u(y)}{h}$ .

If $k\neq0$, then we multiply and divide by $k$ and obtain (4.14): $\frac{u(y+k)-u(y)}{h}\frac{k}{k}=\frac{u(y+k)-u(y)}{k}\cdot\frac{v(x+h)-v(x)}{h}$. As $h$ goes to $0$, the last quotient on the right tends to $v'(x)$. Also, as $h$ goes to $0$, $k$ goes to $0$ as well, because $k=v(x+h)-v(x)$ and $v$ is continuous at $x$. Therefore the first quotient on the right approaches $u'(y)$ as $h$ tends to zero, and this proves the result. $\square$


Although the foregoing argument seems to be the most natural way to proceed, it is not completely general. Since $k=v(x+h)-v(x)$, it may happen that $k=0$ for infinitely many values of $h$ as $h$ tends to zero, in which case the passage from (4.13) to (4.14) is not valid.


My doubt: I have trouble understanding the line "it may happen that $k=0$ for infinitely many values of $h$ as $h$ tends to zero." What is this line trying to convey, and why is the proof incorrect?

Thanks in advance.


Answer 1

Apostol has in mind functions like the topologist's sine curve $$t(x) = \sin \left( \frac1x \right).$$ While this function is not even defined (let alone differentiable) at zero, so it poses no problem for the chain rule proof, its weird cousin $$f(x) = e^{-\frac1{x^2}} \sin \left( \frac1x \right), \qquad f(0)=0,$$ is differentiable everywhere (though it is not analytic at $x=0$).
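As a quick numerical sanity check (a Python sketch; the extension $f(0)=0$ is assumed, as above), the difference quotient of $f$ at $0$ does indeed shrink rapidly, consistent with $f'(0)=0$:

```python
import math

def f(x):
    # f(x) = exp(-1/x^2) * sin(1/x), extended by f(0) = 0
    return 0.0 if x == 0 else math.exp(-1.0 / x**2) * math.sin(1.0 / x)

# Difference quotient at 0: |f(h)/h| <= exp(-1/h^2) / |h|,
# which vanishes extremely fast as h -> 0, so f'(0) = 0.
for h in (0.5, 0.1, 0.05):
    print(h, (f(h) - f(0)) / h)
```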

So the question becomes, "does the chain rule apply if one of the functions is a weird function such as $f(x)$?"

Not a very practical worry, but if you present a "proof" it is always best that the proof be airtight.

Added afterward

Let $g(x) = \frac1{x+1}$. Then $$(f\circ g)(x) = e^{-(1+x)^2}\sin(1+x), \\ \frac{d(f\circ g)(x)}{dx} = e^{-(1+x)^2}\cos(1+x)-2e^{-(1+x)^2}(1+x)\sin(1+x), \\ \left.\frac{d(f\circ g)(x)}{dx}\right|_{x=0} = \frac{\cos(1)-2\sin(1)}{e} \approx -0.42 \neq 0. $$ But applying the chain rule naively at $x=0$, and noting that the derivative of $f$ at zero is zero,

$$ \left.\frac{df(x)}{dx}\right|_{x=0} = 0, \qquad \left.\frac{dg(x)}{dx}\right|_{x=0} = -1, $$ one would wrongly get $$ \left.\frac{d(f\circ g)(x)}{dx}\right|_{x=0} = 0\cdot (-1) = 0. $$
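The value $\approx -0.42$ can be checked numerically. A small Python sketch (the names `f` and `g` mirror the answer's definitions, with $f(0)=0$ assumed):

```python
import math

def f(t):
    # f(t) = exp(-1/t^2) * sin(1/t), extended by f(0) = 0
    return 0.0 if t == 0 else math.exp(-1.0 / t**2) * math.sin(1.0 / t)

def g(x):
    return 1.0 / (x + 1.0)

h = 1e-6
numeric = (f(g(h)) - f(g(-h))) / (2 * h)       # central difference at x = 0
exact = (math.cos(1) - 2 * math.sin(1)) / math.e
print(numeric, exact)   # both are approximately -0.42
```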

But the combination of $f$ and $g$ is not a counterexample to the chain rule, because the chain rule requires taking the derivative of $f$ at $g(0)$, and $g(0)=1$ is not zero.

It turns out the conditions stated in Apostol are in fact sufficient: as long as the functions are differentiable, at $g(x)$ and at $x$ respectively, the chain rule holds.

Answer 2

This is a common problem with the proof of the chain rule in calculus textbooks, and it is great that your book mentions the problem when $k=0$. For instance, the given proof does not work for $v(x)= x^{2}\sin(1/x)$ with $v(0)=0$ at the point $x=0$. Observe that $k=v(h)-v(0)=h^{2}\sin(1/h)$ vanishes at the points $h=1/(n\pi)$ for all nonzero integers $n$. This kind of behavior is what is meant by the line "$k=0$ for infinitely many values of $h$ as $h$ tends to $0$."
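A short Python sketch confirming that $k$ vanishes (up to floating-point rounding) at $h = 1/(n\pi)$:

```python
import math

def v(x):
    # v(x) = x^2 * sin(1/x), with v(0) = 0
    return 0.0 if x == 0 else x**2 * math.sin(1.0 / x)

# k = v(h) - v(0) vanishes whenever sin(1/h) = 0,
# i.e. at h = 1/(n*pi) for every nonzero integer n.
for n in (1, 10, 1000):
    h = 1.0 / (n * math.pi)
    print(n, v(h) - v(0))
```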

The proof given in your book is therefore incomplete, since it should also handle the case $k=0$. It might be harsh to call the proof incorrect; I would rather term it partial/incomplete.

However, it is easy to salvage the proof when $k$ vanishes infinitely many times as $h$ tends to $0$. The key observation is that in this case $v'(x)=0$ (the difference quotient $k/h$ vanishes along a sequence $h \to 0$, and its limit exists), so we need to show that $f'(x)=0$. When $k=0$ we have $f(x+h)-f(x)=u(y+k)-u(y)=0$, and when $k\neq 0$ the ratio $(f(x+h)-f(x))/h = \frac{u(y+k)-u(y)}{k}\cdot\frac{k}{h}$ can be made arbitrarily small, because the first factor stays bounded (it tends to $u'(y)$ as $k\to 0$) while $k/h \to v'(x) = 0$. Hence $f'(x)=0$.
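To illustrate the salvage numerically, here is a Python sketch with $v(x)=x^{2}\sin(1/x)$ and an arbitrarily chosen differentiable outer function (here $u=\sin$, my choice for illustration, not from the answer):

```python
import math

def v(x):
    # v(x) = x^2 * sin(1/x), with v(0) = 0
    return 0.0 if x == 0 else x**2 * math.sin(1.0 / x)

u = math.sin   # any differentiable u will do; u = sin is just an example

# The difference quotient of f = u∘v at 0 is bounded by |h|
# (since |sin t| <= |t| and |v(h)| <= h^2), so f'(0) = 0.
for h in (1e-1, 1e-3, 1e-5):
    q = (u(v(h)) - u(v(0))) / h
    print(h, q)
```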