Why is this proof of the chain rule incorrect?


[Image: the proof in question]

I saw this proof of the chain rule, but the source says it is flawed. Why? My guess is that it is wrong because you can't substitute $g(x+h)$ and $g(x)$ into the limit.


I see at least two problems:

  1. the proof divides by $g'(x)$, which requires $g'(x) \ne 0$, and that is not always true;
  2. in the equation $$(f \circ g)'(x)\cdot\frac{1}{g'(x)}=\lim_{h \to 0} \left(\frac{f(g(x+h))-f(g(x))}{h}\right)\cdot\left(\frac{h}{g(x+h)-g(x)}\right) $$ you are assuming that $\lim_{h \to 0}\left(\frac{f(g(x+h))-f(g(x))}{h}\right)$ exists. That limit is $(f\circ g)'(x)$ by definition, and its existence is part of what the chain rule asserts, so it has to be proved rather than assumed (though perhaps you can prove the existence of the limit by other means).

Hope this helps.


To expand on my comment, the fundamental issue is that $g(x+h) - g(x)$ may vanish for some $h \neq 0$ in every neighbourhood of $h=0$. The issue of $g'(x)$ being $0$ (though certainly a mistake in the proof) is not that important, since this "proof" can be trivially modified so that the $g'(x)$ term stays on the right-hand side. For example, write

$$\frac{f(g(x+h))-f(g(x))}{h} = \frac{f(g(x+h))-f(g(x))}{g(x+h) - g(x)} \cdot \frac{g(x+h) - g(x)}{h} \ \ \ \ \ \ \ \ \ (1)$$

and let $h \to 0$ in the equation. The problem, again, is that for certain badly behaved functions we may have $g(x+h) = g(x)$ for some $h \neq 0$ in every neighbourhood of $h = 0$ (e.g. $g(t) = t^2\sin \frac 1t$ with $g(0) = 0$, at $x=0$).
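To make that example concrete, here is a short worked check, taking $g(0) = 0$ as the value that makes $g$ differentiable at $0$. At the points $h_n = \frac{1}{n\pi}$, $n = 1, 2, 3, \dots$,

$$g(0 + h_n) - g(0) = \frac{1}{n^2\pi^2}\sin(n\pi) - 0 = 0,$$

and $h_n \to 0$, so every neighbourhood of $h = 0$ contains nonzero $h$ at which the denominator $g(x+h) - g(x)$ in $(1)$ is zero. Note also that $g'(0) = \lim_{h \to 0} h\sin\frac 1h = 0$ here.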

The trick (credit to Michael Spivak) is as follows. Define $$\sigma(h) = \begin{cases} f'(g(x)) & g(x+h) = g(x) \\ \ \frac{f(g(x+h))-f(g(x))}{g(x+h) - g(x)} & \text{otherwise} \end{cases}$$

and note that as $h \to 0$, this tends to $f'(g(x))$ without any division-by-zero problems: $g$ is continuous at $x$, so $g(x+h) - g(x) \to 0$, and $f$ is differentiable at $g(x)$. Now, substitute $\sigma(h)$ for the first fraction on the RHS in $(1)$ and let $h \to 0$. The substitution is justified because the equality in the modified version of $(1)$ will always hold (can you see why?).
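A minimal numerical sketch of this construction may help. It assumes the illustrative choices $f = \sin$, $g(t) = t^2\sin\frac 1t$ with $g(0) = 0$, and $x = 0$; the function names `sigma`, `lhs` and `rhs` are hypothetical and exist only for this check. It compares the left-hand side of $(1)$ with the modified right-hand side in which $\sigma(h)$ replaces the first fraction.

```python
import math

def g(t):
    # Inner function from the example above; g(0) = 0 is assumed.
    return t * t * math.sin(1.0 / t) if t != 0 else 0.0

f = math.sin        # hypothetical outer function, chosen for illustration
f_prime = math.cos  # its derivative

def sigma(h, x=0.0):
    # Equals f'(g(x)) when g(x+h) = g(x), the difference quotient otherwise.
    dg = g(x + h) - g(x)
    if dg == 0.0:
        return f_prime(g(x))
    return (f(g(x + h)) - f(g(x))) / dg

def lhs(h, x=0.0):
    # Left-hand side of (1): difference quotient of the composite.
    return (f(g(x + h)) - f(g(x))) / h

def rhs(h, x=0.0):
    # Modified right-hand side of (1), with sigma(h) in place of the first fraction.
    return sigma(h, x) * (g(x + h) - g(x)) / h

for h in (1e-1, 1e-2, 1e-3):
    print(h, lhs(h), rhs(h), sigma(h))
```

In floating point the two sides should agree up to rounding, and `sigma(h)` should approach $f'(g(0)) = \cos 0 = 1$ as `h` shrinks. This is only a sanity check under the stated assumptions, not a substitute for the argument.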


I must say it is a very weird proof (or perhaps a weird attempt at a proof of the chain rule), and it does not really capture the essence of the chain rule, which says:

Chain Rule: If $g$ is differentiable at $a$ and $f$ is differentiable at $g(a)$ then $f\circ g$ is differentiable at $a$ and $(f\circ g)'(a) = f'(g(a))g'(a)$.

The proof presented in your post assumes that $(f\circ g)'(a)$ exists (whereas this is part of the conclusion of the chain rule, not one of its hypotheses). Another problem is that it assumes $g'(a) \neq 0$, which is a rather artificial restriction. So the proof given in your post is not a proof of the chain rule, but rather a proof of the following result (which can at best be called a simpler version of the chain rule):

Theorem: If $g$ is differentiable at $a$ with $g'(a) \neq 0$ and $f$ is differentiable at $g(a)$ and $f\circ g$ is differentiable at $a$ then $(f\circ g)'(a) = f'(g(a))g'(a)$.


BTW I hope your book has given a proper proof of the chain rule and is then comparing it with one of the many flawed proofs available in calculus textbooks. If not, then you need to consider two cases in the proof of the chain rule: 1) when $g'(a) \neq 0$ and 2) when $g'(a) = 0$.

The first case is easy. The fact that $$g'(a) = \lim_{h \to 0}\frac{g(a + h) - g(a)}{h}\neq 0$$ means that there is a value $\delta > 0$ such that $$\frac{g(a + h) - g(a)}{h}\neq 0$$ for all $h$ with $0 < |h| < \delta$. This ensures that $g(a + h) - g(a) \neq 0$ for $0 < |h| < \delta$.

Now we have \begin{align} (f\circ g)'(a) &= \lim_{h \to 0}\frac{f(g(a + h)) - f(g(a))}{h}\notag\\ &= \lim_{h \to 0}\frac{f(g(a + h)) - f(g(a))}{g(a + h) - g(a)}\cdot\frac{g(a + h) - g(a)}{h}\notag\\ &= \lim_{k \to 0}\frac{f(g(a) + k) - f(g(a))}{k}\cdot\lim_{h \to 0}\frac{g(a + h) - g(a)}{h}\text{ (putting }k = g(a + h) - g(a))\notag\\ &= f'(g(a))g'(a)\notag \end{align}

If $g'(a) = 0$ then we need to establish that $(f\circ g)'(a) = 0$. Let $\epsilon > 0$ be given. We will find a $\delta > 0$ such that $$\left|\frac{f(g(a + h)) - f(g(a))}{h}\right| < \epsilon$$ for $0 < |h| < \delta$. Since $f'(g(a))$ exists, the ratio $$\frac{f(g(a) + k) - f(g(a))}{k}$$ is bounded for all sufficiently small nonzero values of $k$. To put it rigorously, there exist real numbers $M > 0, \delta_{1} > 0$ such that $$\left|\frac{f(g(a) + k) - f(g(a))}{k}\right| < M\tag{1}$$ for all $0 < |k| < \delta_{1}$.

Further, since $g$ is continuous at $a$ (being differentiable there) and $g'(a) = 0$, there is a $\delta > 0$ such that $$|g(a + h) - g(a)| < \delta_{1}, \left|\frac{g(a + h) - g(a)}{h}\right| < \frac{\epsilon}{M}\tag{2}$$ for all $h$ with $0 < |h| < \delta$.

Let $k = g(a + h) - g(a)$. If $k = 0$ then $f(g(a + h)) - f(g(a)) = 0$ so that $$\left|\frac{f(g(a + h)) - f(g(a))}{h}\right| < \epsilon$$ trivially and if $k \neq 0$ then using $(1)$ and $(2)$ we see that \begin{align} \left|\frac{f(g(a + h)) - f(g(a))}{h}\right| &= \left|\frac{f(g(a) + k) - f(g(a))}{k}\right|\cdot\left|\frac{g(a + h) - g(a)}{h}\right|\notag\\ &< M \cdot\frac{\epsilon}{M} = \epsilon\notag \end{align} for all values of $h$ with $0 < |h| < \delta$. We have thus established the chain rule for the case when $g'(a) = 0$.
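As a rough numerical illustration of the case $g'(a) = 0$, the sketch below assumes the hypothetical choices $f = \sin$, $g(t) = t^2\sin\frac 1t$ with $g(0) = 0$, and $a = 0$ (none of these are taken from the proof itself). For this $f$ we have $\left|\frac{\sin k}{k}\right| \le 1$, so $M = 2$ works in $(1)$, and $\left|\frac{g(h) - g(0)}{h}\right| \le |h|$, so the difference quotient of $f\circ g$ at $0$ should be at most about $|h|$ in absolute value.

```python
import math

def g(t):
    # Assumed inner function with g'(0) = 0; g(0) = 0 by definition.
    return t * t * math.sin(1.0 / t) if t != 0 else 0.0

def f(u):
    # Hypothetical outer function, chosen for illustration.
    return math.sin(u)

def inner_quotient(h, a=0.0):
    # (g(a + h) - g(a)) / h, which tends to g'(a) = 0.
    return (g(a + h) - g(a)) / h

def composite_quotient(h, a=0.0):
    # (f(g(a + h)) - f(g(a))) / h, which the proof shows tends to 0.
    return (f(g(a + h)) - f(g(a))) / h

for h in (1e-1, 1e-2, 1e-3, 1e-4):
    print(h, inner_quotient(h), composite_quotient(h))
```

Both printed quotients should shrink with `h`, consistent with $(f\circ g)'(0) = 0 = f'(g(0))\,g'(0)$. This checks one example only and proves nothing in general.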