Why is the "correct" proof of the chain rule correct? What is actually happening here?

Question

3.7k Views Asked by Bumbble Comm At 04 Apr 2026 - 10:17

There is a correct and an incorrect proof of the Chain Rule going around (see below). The problem with the incorrect proof is that $g(x)-g(a)$ might be $0$ for values of $x$ arbitrarily close to $a$, which creates a division by zero.

Question

I can't get my head around why the correct proof solves the problem of the incorrect proof. Why can we just define a function $E$ and suddenly all our problems disappear?

I just don't really get what actually happens in the correct proof. It just didn't "click" in my brain yet. Any help would be much appreciated.

By the way, is my "correct proof" below indeed correct?

Incorrect proof:

$$\lim \limits_{x \to a}\frac{f(g(x))-f(g(a))}{x-a}=\lim \limits_{x \to a}\frac{f(g(x))-f(g(a))}{g(x)-g(a)}\times\frac{g(x)-g(a)}{x-a}=f'(g(x))g'(x)$$

Correct proof:

We first define a function $E$

$$E(0)=0$$ $$E(g(x)-g(a))=\frac{f(g(x))-f(g(a))}{g(x)-g(a)}-f'(g(x))$$

In any case: $$f(g(x))-f(g(a))=(E(g(x)-g(a))+f'(g(x)))\times(g(x)-g(a))$$

Dividing by $x-a$ and taking the limit we get:

$$\begin{align} \frac{d}{dx}f(g(x))&=\lim \limits_{x \to a}\frac{f(g(x))-f(g(a))}{x-a}\\ &=\lim \limits_{x \to a}(E(g(x)-g(a))+f'(g(x)))\times\frac{g(x)-g(a)}{x-a}\\&=f'(g(x))g'(x) \end{align}$$

EDIT: In other words: we basically state that when $g(x)=g(a)$:

$$\frac{f(g(x))-f(g(a))}{g(x)-g(a)}-f'(g(x))=0$$

But why can we state that? As I understand it, this is true for the limit, but why are we allowed to also state it for the actual value?

There are 5 best solutions below

Bumbble Comm On 26 Oct 2017 - 9:41

Note that we get into trouble with $\frac{f(g(x))-f(g(a))}{g(x)-g(a)}$ when $g(x) = g(a)$. However, as a function of $x$, it has a well-defined limit at all those points, namely $f'(g(a))$. So what they do when introducing $E$ is simply "filling in" those holes so that we get an expression that is valid for all $x$. We could just as well have said:

Consider the expression which is $$ \frac{f(g(x))-f(g(a))}{g(x)-g(a)}\times\frac{g(x)-g(a)}{x-a} $$ when $g(x) \neq g(a)$, and $$ f'(g(a))\times \frac{g(x)-g(a)}{x-a} $$ when $g(x) = g(a)$, and take its limit when $x\to a$.

and this would've been more or less the exact same thing.
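To make the "filling in the holes" idea concrete, here is a minimal sketch in exact rational arithmetic. The functions `f` and `g` are hypothetical choices made only for illustration (they are not from the question): `g` is constant for $x \le 0$, so $g(x) = g(0)$ at points arbitrarily close to $a = 0$, and the naive quotient would divide by zero there.

```python
from fractions import Fraction

def f(y):
    return y * y + 3 * y          # illustrative f, with f'(y) = 2y + 3

def f_prime(y):
    return 2 * y + 3

def g(x):
    # Illustrative g: differentiable at 0 with g'(0) = 0,
    # and g(x) == g(0) for every x <= 0.
    return x * x if x > 0 else Fraction(0)

def filled_quotient(x, a):
    """Difference quotient of f at g(a), with the holes at g(x) == g(a) filled in."""
    dg = g(x) - g(a)
    if dg == 0:
        return f_prime(g(a))      # the value that fills the hole
    return (f(g(x)) - f(g(a))) / dg

def composite_quotient(x, a):
    """Difference quotient of the composite f(g(x)) at a."""
    return (f(g(x)) - f(g(a))) / (x - a)

a = Fraction(0)
for x in (Fraction(1, 10), Fraction(-1, 10), Fraction(1, 1000), Fraction(-1, 1000)):
    lhs = composite_quotient(x, a)
    rhs = filled_quotient(x, a) * (g(x) - g(a)) / (x - a)
    assert lhs == rhs             # the factorisation holds for every x != a
```

For negative `x` the first factor takes the filled-in value $f'(g(a))$ and the second factor is $0$, so the product still equals the difference quotient of $f \circ g$.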

Bumbble Comm On 26 Oct 2017 - 10:03

Here is a "correct" proof:

From the usual definition of the derivative one immediately deduces the following

Lemma. A function $f$ is differentiable at the point $a$ with $f'(a)=A$ iff there is a function $m_{f,a}=:m$, continuous at $a$ with $m(a)=A$, such that for all $x$ one has $$f(x)-f(a)=m(x)(x-a)\ .$$

Under the hypotheses of the chain rule one therefore has $$f\bigl(g(x)\bigr)-f\bigl(g(a)\bigr)=m_{f,g(a)}\bigl(g(x)\bigr)\bigl(g(x)-g(a)\bigr)=m_{f,g(a)}\bigl(g(x)\bigr)m_{g,a}(x)(x-a)\ .$$ Since $g$ is continuous at $a$ the product $x\mapsto m_{f,g(a)}\bigl(g(x)\bigr)m_{g,a}(x)$ is continuous at $a$ as well, and takes the value $f'\bigl(g(a)\bigr)g'(a)$ there. By the reverse direction of the Lemma the chain rule follows.
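As a sanity check of the Lemma, here is a small sketch with polynomial functions, where the slope functions can be written down by hand. The choices $f(y) = y^3$ and $g(x) = x^2 + 1$, and the names `m_f`, `m_g`, `m_composite`, are assumptions made for this example only.

```python
def g(x):
    return x * x + 1

def f(y):
    return y ** 3

def m_g(x, a):
    # g(x) - g(a) = (x + a)(x - a), so m_g(a) = 2a = g'(a)
    return x + a

def m_f(y, b):
    # f(y) - f(b) = (y^2 + y*b + b^2)(y - b), so m_f(b) = 3*b^2 = f'(b)
    return y * y + y * b + b * b

def m_composite(x, a):
    # Slope function of f(g(x)) at a, built as in the displayed identity
    return m_f(g(x), g(a)) * m_g(x, a)

a = 2
for x in range(-5, 6):
    # Holds for every x, including x = -2 where g(x) == g(a): no division occurs
    assert f(g(x)) - f(g(a)) == m_composite(x, a) * (x - a)

print(m_composite(a, a))  # f'(g(a)) * g'(a) = 75 * 4 = 300
```

Nothing is ever divided by $g(x) - g(a)$ here, which is why this formulation sidesteps the problem entirely.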

Bumbble Comm On 27 Oct 2017 - 6:29

You can avoid the "correct" proof this way:

Case 1: $g'(a) \ne 0.$ Here the "fake proof" works! That's simply because $(g(x) - g(a))/(x-a)$ is nonzero for $x$ close to, but not equal to, $a.$ For such $x,$ we have $g(x)\ne g(a),$ and now the fake news is actually news.

Case 2: $g'(a) = 0:$ Because $f'(g(a))$ exists, there exist a constant $c>0$ and a $\delta > 0$ such that

$$\tag 1 |f(y)-f(g(a))|\le c|y-g(a)|\, \text { for } y\in (g(a)-\delta, g(a)+\delta).$$

Now $g$ is continuous at $a,$ so there exists $\gamma > 0$ such that $x\in (a-\gamma, a + \gamma)$ implies $g(x) \in (g(a)-\delta, g(a)+\delta).$ For such $x$ we can use $(1)$ to see

$$|f(g(x))-f(g(a))| \le c |g(x)-g(a)|.$$

Now divide by $|x-a|$ and let $x\to a.$ On the right we get limit $0$ because $g'(a)=0.$ Therefore the limit on the left is $0,$ which is exactly the same as saying $(f\circ g)'(a) = 0.$ That is the desired conclusion in this case.

Bumbble Comm On 27 Oct 2017 - 10:20

Here is one proof which does not require you to have any special definition for the difference quotient. Consider the ratio $$\frac{f(g(x)) - f(g(a))} {x-a} \tag{1}$$ It can be written as $$\frac{f(g(x))-f(g(a))}{g(x)-g(a)}\cdot \frac{g(x) - g(a)} {x-a} \tag{2}$$ provided $g(x) - g(a) \neq 0$ for all $x$ in some deleted neighborhood of $a$. Under this assumption the usual proof works and we get the result $(f\circ g) '(a) =f' (g(a)) g'(a) $ by taking limit as $x\to a$ in equation $(2)$.

Let's see what happens when this assumption does not hold. It means that in every deleted neighborhood of $a$ we have some $x$ for which $g(x) =g(a) $. It is easy to prove that in this case we have $g'(a) =0$ (prove this and let me know if you need help here, you can start by assuming $g'(a) >0$ and try to get a contradiction and similarly handle $g'(a) <0$). Now we can see that if $g(x) =g(a) $ then the difference quotient in $(1)$ is $0$. And if $g(x) \neq g(a) $ then the difference quotient can be written as in $(2)$ and the first factor is bounded (because $f'(g(a)) $ exists) and second factor tends to $0$ so that the overall product also tends to $0$ as $x\to a$ and thus $(f\circ g) '(a) =0$. The reasoning in the last sentence can be formalized with the definition of limit as shown below.

Let $\epsilon >0$ be arbitrary. There exists an $\epsilon' >0$ such that $$\left|\frac{f(y) - f(g(a))} {y-g(a)} - f'(g(a)) \right|<1$$ for all $y$ with $0<|y-g(a)|<\epsilon '$. Therefore $$\left|\frac{f(y) - f(g(a))} {y-g(a)} \right|<|f' (g(a)) |+1=K\text{(say)}\tag{3}$$ whenever $0<|y-g(a)|<\epsilon '$.

Next note that $g$ is continuous at $a$ (because it is differentiable at $a$), so there is a $\delta_{1}>0$ such that $$|g(x) - g(a) |<\epsilon' \tag{4}$$ whenever $|x-a|<\delta_{1}$. Further, since $g'(a) =0$ there is a $\delta_{2}>0$ such that $$\left|\frac{g(x) - g(a)} {x-a} \right|<\frac{\epsilon} {K}\tag{5} $$ whenever $0<|x-a|<\delta_{2}$.

Let $\delta=\min(\delta_{1},\delta_{2})$. If $0<|x-a|<\delta$ then both inequalities $(4)$ and $(5)$ hold. If $g(x) =g(a)$ then the difference quotient in $(1)$ is $0$. If $g(x)\neq g(a) $ then by $(4)$ we may apply $(3)$ with $y=g(x)$, so the first factor in $(2)$ is less than $K$ in absolute value, while by $(5)$ the second factor is less than $\epsilon/K$; hence the difference quotient in $(1)$ is less than $K\cdot\epsilon/K=\epsilon$ in absolute value. In other words we have $$\left|\frac{f(g(x)) - f(g(a))} {x-a} \right|<\epsilon$$ whenever $0<|x-a|<\delta$. Thus $(f\circ g) '(a) =0$.

The above proof is taken from Hardy's A Course of Pure Mathematics and it avoids the trick used by Spivak (defining the difference quotient $(1)$ in a continuous manner when $g(x) =g(a) $). The essential idea of the proof is easy to understand and the last part of the proof dealing with $\epsilon, \delta$ is necessary only to satisfy those who insist.
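For readers who want to see the second case in action, here is a rough sketch with a hypothetical $g$ that returns to the value $g(0)$ at every point $x = 1/n$, so that every deleted neighborhood of $a = 0$ contains points with $g(x) = g(a)$. Both `f` and `g` are assumptions chosen so that exact rational arithmetic works; they do not come from Hardy's text.

```python
from fractions import Fraction

def f(y):
    return y * y + 3 * y              # f'(0) = 3, so K = |f'(g(a))| + 1 = 4

def g(x):
    # Hypothetical g with g(1/n) = 0 = g(0) for every nonzero integer n.
    # Since |g(x)| <= x^2 / 2, we get g'(0) = 0.
    if x == 0:
        return Fraction(0)
    u = 1 / x
    return x * x * abs(u - round(u))  # x^2 times the distance from 1/x to the nearest integer

a = Fraction(0)
K = 4

def quotient_1(x):
    """The difference quotient (1) of the composite at a."""
    return (f(g(x)) - f(g(a))) / (x - a)

samples = [Fraction(1, n) for n in (10, 100, 1000)]            # here g(x) == g(a)
samples += [Fraction(2, 2 * n + 1) for n in (10, 100, 1000)]   # here g(x) != g(a)

for x in samples:
    q = quotient_1(x)
    if g(x) == g(a):
        assert q == 0                 # first case: the numerator vanishes
    else:
        # second case: |first factor| < K, so |q| is at most K * |second factor|
        assert abs(q) <= K * abs((g(x) - g(a)) / (x - a))
```

In both cases the quotient is controlled by $K$ times $|(g(x)-g(a))/(x-a)|$, which tends to $0$ because $g'(a) = 0$.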

**Bumbble Comm** · Accepted Answer

There are two things wrong with your original proof, and the "EDIT" section of the original post is also wrong.

First problem: To define a function $E$, you have to say how to apply $E$ to an arbitrary number $h$. You haven't done that. Here is a better definition of $E$: $E(0) = 0$, and if $h \ne 0$ then \begin{equation} E(h) = \frac{f(g(a)+h) - f(g(a))}{h} - f'(g(a)). \end{equation} For $h \ne 0$, the formula defining $E(h)$ can be rearranged to read: \begin{equation} (E(h) + f'(g(a))) \times h = f(g(a)+h) - f(g(a)). \end{equation} But notice that this last equation is also true if $h=0$, since both sides are $0$, so the equation is true for all values of $h$. Plugging in $g(x)-g(a)$ for $h$, we get \begin{equation} (E(g(x)-g(a))+f'(g(a))) \times (g(x)-g(a)) = f(g(x))-f(g(a)). \end{equation} This is (almost) the same as your "in any case" equation.

Second problem: In your final calculation, you are mixing up the derivative with the value of the derivative at a particular point. The limit \begin{equation} \lim_{x \to a} \frac{f(g(x))-f(g(a))}{x-a} \end{equation} doesn't give you the derivative, it gives you the value of the derivative at $a$. So the proof should end like this: \begin{align} \left.\frac{d}{dx}f(g(x))\right|_{x=a} &= \lim_{x \to a} \frac{f(g(x))-f(g(a))}{x-a}\\ &= \lim_{x \to a} (E(g(x)-g(a))+f'(g(a))) \times \frac{g(x) - g(a)}{x-a}\\ &= f'(g(a))g'(a). \end{align}

There is a subtle point in the last step that you may be missing. Since $g$ is differentiable at $a$, it is continuous at $a$, so $\lim_{x \to a} (g(x) - g(a)) = g(a)-g(a) = 0$. But why does it follow that $\lim_{x \to a}E(g(x)-g(a)) = E(0) = 0$? The answer is: because $E$ is continuous at $0$. (Look in your calculus book in the section on continuous functions. You will find a theorem that says that if $\lim_{x \to a} f(x) = L$ and $g$ is continuous at $L$, then $\lim_{x \to a} g(f(x)) = g(L)$. That theorem is being used in this step.) So to have a complete proof, you need to verify that $E$ is continuous at $0$. To verify that, check that $\lim_{h \to 0} E(h) = 0 = E(0)$. In this limit, $h$ is approaching $0$ but it is not equal to $0$, so we can use the formula for $E(h)$ when $h \ne 0$: \begin{equation} \lim_{h \to 0} E(h) = \lim_{h \to 0} \left(\frac{f(g(a)+h)-f(g(a))}{h} - f'(g(a))\right) = f'(g(a))-f'(g(a)) = 0. \end{equation}

Finally, the problem with the "EDIT" section of the original post: You seem to think that by defining $E$, we are somehow changing the meaning of the expression \begin{equation} \frac{f(g(x))-f(g(a))}{g(x)-g(a)}. \end{equation} We are not. That expression still means what it meant before, so it is undefined when $g(x) = g(a)$. All we're doing is defining a new function $E$, and it is only formulas involving the letter $E$ whose meaning is affected by that definition. No justification is needed for this--you can define a new function however you want.
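If it helps, the definition of $E$ and the rearranged identity can be checked on a toy example. This is only a sketch under assumed choices: $f(y) = y^2 + 3y$, $g(x) = x^2 - 4x$ and $a = 1$ are picked so that $g(3) = g(1)$, which makes the plain quotient undefined at $x = 3$ while the formula involving $E$ still holds.

```python
from fractions import Fraction

def f(y):
    return y * y + 3 * y

def f_prime(y):
    return 2 * y + 3

def g(x):
    return x * x - 4 * x      # g(1) == g(3), so g(x) - g(a) can vanish for x != a

a = Fraction(1)
b = g(a)                      # b = g(a) = -3

def E(h):
    # E is defined for every number h, with the value at h = 0 set by hand
    if h == 0:
        return Fraction(0)
    return (f(b + h) - f(b)) / h - f_prime(b)

def raw_quotient(x):
    # The original expression: still undefined when g(x) == g(a)
    return (f(g(x)) - f(b)) / (g(x) - b)

for x in (Fraction(3), Fraction(5, 2), Fraction(11, 10)):
    h = g(x) - b
    # The rearranged identity holds for all x, including x = 3 where h == 0
    assert (E(h) + f_prime(b)) * h == f(g(x)) - f(b)

try:
    raw_quotient(Fraction(3))
    raw_is_defined = True
except ZeroDivisionError:
    raw_is_defined = False
assert not raw_is_defined
```

For this particular $f$, a short computation gives $E(h) = h$, so the continuity of $E$ at $0$ is visible directly.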
