Proof of the chain rule: is this proof correct and did I use the right notation?


I created this proof of the chain rule. Being a (relative) beginner at math, I have a few questions.

  1. Is the proof below correct? I was especially in doubt about the use of $h$ on both sides.
  2. Is the (Lagrange?) notation correct this way?
  3. How would I write the same proof using Leibniz's notation? I struggled to write this proof in Leibniz notation, because what would $dg$ mean in that case? Is it $g(x+h)-g(x)$ or $k$ or $h$?

To be proved:

If $f(u)$ is differentiable at $u=g(x)$, and $g(x)$ is differentiable at $x$ then:

$$f(g(x))'\stackrel{?}{=}f'(g(x))g'(x)$$ Or similarly $$\lim \limits_{h \to 0}\frac{f(g(x+h))-f(g(x))}{h}\stackrel{?}{=}\lim \limits_{k \to 0}\frac{f(g(x)+k)-f(g(x))}{k}\lim \limits_{h \to 0}\frac{g(x+h)-g(x)}{h}$$

Case 1: if $h$ has a value such that $g(x+h)=g(x)$ then: $$\lim \limits_{h \to 0}\frac{f(g(x+h))-f(g(x))}{h}=0$$ And $$\lim \limits_{k \to 0}\frac{f(g(x)+k)-f(g(x))}{k}\lim \limits_{h \to 0}\frac{g(x+h)-g(x)}{h}=0$$

Both sides of the equation to prove equal zero, therefore the equation holds in this case.

Case 2: if $h$ has a value such that $g(x+h)\ne g(x)$ then:

We multiply the left-hand side by $\frac{g(x+h)-g(x)}{g(x+h)-g(x)}$: $$\lim \limits_{h \to 0}\frac{f(g(x+h))-f(g(x))}{h}=\lim \limits_{h \to 0}\frac{f(g(x+h))-f(g(x))}{g(x+h)-g(x)}\lim \limits_{h \to 0}\frac{g(x+h)-g(x)}{h}$$

Taking $$u=g(x)$$ $$k=g(x+h)-g(x)$$ we get $$\lim \limits_{h \to 0}\frac{f(g(x+h))-f(g(x))}{h}=\lim \limits_{h \to 0}\frac{f(u+k)-f(u)}{k}\lim \limits_{h \to 0}\frac{g(x+h)-g(x)}{h}$$

And since $k\to 0$ as $h\to 0$, $$\lim \limits_{h \to 0}\frac{f(g(x+h))-f(g(x))}{h}=\lim \limits_{k \to 0}\frac{f(u+k)-f(u)}{k}\lim \limits_{h \to 0}\frac{g(x+h)-g(x)}{h}$$

Thus $$f(g(x))'=f'(u)g'(x)=f'(g(x))g'(x) \tag*{$\blacksquare$}$$


There are 6 best solutions below

5
On BEST ANSWER

One thing that's worth learning is the notation for the composition of $f$ and $g.$ We use $f\circ g$ to denote this function. I.e., $(f\circ g)(x) = f(g(x)).$

In your "To be proved" statement, we can use this notation. You wrote $f(g(x))' = f'(g(x))g'(x).$ The problem with this is that on the left you have $'$ followed by blank space, whereas on the right the $'$ symbols are followed by further notation. It's an inconsistency that can be corrected by writing

$$(f\circ g)'(x)=f'(g(x))g'(x).$$

On your paragraph that starts "Or similarly": You have

$$\lim \limits_{h \to 0}\frac{f(g(x+h))-f(g(x))}{h}$$ $$=\lim \limits_{k \to 0}\frac{f(g(x)+k)-f(g(x))}{k}\cdot\lim \limits_{h \to 0}\frac{g(x+h)-g(x)}{h}.$$

That is fine. You asked about having $h$ on both sides, but that is no problem whatsoever. In fact you could replace $k$ by $h$ on the right. The variable $h$ is called a "dummy variable", meaning any symbol could be used there (except for the symbols that already have meaning, like $x.$)

That's the small stuff. Others have pointed out the big mistakes, where you divide the proof into two cases: i) There is an $h\ne 0$ such that $g(x+h)-g(x) =0,$ and ii) There is an $h\ne 0$ such that $g(x+h)-g(x) \ne 0.$ Huge problem here: a few values of $h$ cannot tell you anything about a limiting process, where we are letting $h\to 0$ through infinitely many values.

A better division into cases is this: case i) $g'(x)=0;$ case ii) $g'(x)\ne 0.$ How would the proofs go in these cases?

Proof for case i): Observe the following: There is a constant $C$ such that

$$|f(y)-f(g(x))|\le C|y-g(x)|$$

for all $y$ sufficiently close to $g(x).$ This follows from the existence of $f'(g(x)).$ Thus for small $h\ne 0,$

$$\left |\frac{f(g(x+h))-f(g(x))}{h} \right | \le C\frac{|g(x+h)-g(x)|}{|h|} \to C\cdot |g'(x)|=0.$$

Thus $(f\circ g)'(x) = 0,$ which is exactly what we want in this case.
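For readers wondering where the constant $C$ in case i) comes from, here is one way to fill in that step. This is only a sketch: the value $C=|f'(g(x))|+1$ and the radius $\delta$ are convenient choices made here for illustration, not part of the answer above. Since the difference quotient of $f$ at $g(x)$ converges to $f'(g(x)),$ there is a $\delta>0$ such that

$$\left|\frac{f(y)-f(g(x))}{y-g(x)}\right|\le |f'(g(x))|+1 \quad\text{whenever } 0<|y-g(x)|<\delta.$$

Multiplying through by $|y-g(x)|$ gives

$$|f(y)-f(g(x))|\le C\,|y-g(x)| \quad\text{with } C=|f'(g(x))|+1,$$

and this also holds trivially when $y=g(x).$ Because $g$ is differentiable at $x,$ it is continuous there, so $y=g(x+h)$ satisfies $|y-g(x)|<\delta$ for all sufficiently small $h,$ which is what the estimate needs.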

Proof for case ii): This is the easy case. We need only observe that $g'(x)\ne 0$ implies $g(x+h)-g(x)\ne 0$ for all small nonzero $h.$ For such $h$ we can do what all beginners crave to do:

$$\frac{f(g(x+h))-f(g(x))}{h} = \frac{f(g(x+h))-f(g(x))}{g(x+h)-g(x)}\frac{g(x+h)-g(x)}{h} \to f'(g(x))g'(x).$$
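The observation that $g'(x)\ne 0$ forces $g(x+h)-g(x)\ne 0$ for small nonzero $h$ can be spelled out as follows. This is a sketch, with the tolerance $|g'(x)|/2$ and the radius $\delta$ chosen here for illustration. By the definition of the derivative, there is a $\delta>0$ such that for $0<|h|<\delta,$

$$\left|\frac{g(x+h)-g(x)}{h}-g'(x)\right|<\frac{|g'(x)|}{2},$$

and therefore, by the reverse triangle inequality,

$$\left|\frac{g(x+h)-g(x)}{h}\right|>\frac{|g'(x)|}{2}>0.$$

So $g(x+h)-g(x)\ne 0$ for these $h,$ and the division is legitimate. Since $g$ is continuous at $x,$ we also have $g(x+h)\to g(x)$ as $h\to 0,$ which is why the first factor tends to $f'(g(x)).$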

I've been brief in these proofs. Please ask if you have questions.

3
On

The main reason that this is not written well is that you are manipulating limits. This is very hard to read because there is a non-trivial claim built into each use of "$\lim$".

Unless you have separately verified it, the fact that the function $f \circ g$ is differentiable at $x$ is really part of the conclusion, so if you start with this limit on the left-hand side and then manipulate that expression, in my book you have committed an unforgivable analysis sin immediately.

If you are just doing algebra, then just write the algebra without $\lim$ in front of everything.

If you are taking a limit, justify it exists first, and then take the limit.

2
On

This answer will attempt to elaborate further on the problem that Medo pointed out in the comments:

Fix some value of $x$ and consider the sets $$ H=\{h\mid\ g(x+h)= g(x)\} \\ H^c=\{h\mid\ g(x+h)\neq g(x)\} $$ Here $H$ corresponds to your case 1, and $H^c$ corresponds to your case 2.

  • Either set may be infinite (countable or uncountable) or empty and $H$ may also be finite.
  • So first of all, there is no guarantee that $h=0$ is a limit point of both sets, meaning $h\to 0$ may not even make sense within both sets. In that case one can simply argue using the set in which $h=0$ is in fact a limit point, and everything should be fine.
  • Second of all, if $h=0$ is a limit point of both $H$ and $H^c$ so that $h\to 0$ makes sense for both, we need an extra argument to make sure that the two limits are equal. This is where some work remains to be done.

The way I have developed for dealing with the problem uses a more Leibnizian approach, namely define: $$ \frac{\Delta f(g)}{\Delta x} = \begin{cases} 0 & \text{for }\Delta g=0 \\ \quad\\ \frac{\Delta f}{\Delta g}\cdot\frac{\Delta g}{\Delta x} & \text{for }\Delta g\neq 0 \end{cases} $$ where $\Delta g$ and $\Delta f$ denote the corresponding changes of $g(x)$ and $f(g(x))$ when $x$ is changed by $\Delta x$. This can be shown to be continuous and always equal to $$ \frac{f(g+\Delta g)-f(g)}{\Delta x} $$ and so it provides a continuous alternative to the problematic factorization by "filling in" the missing values when $\Delta g=0$.
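The claim that the piecewise definition always agrees with the plain quotient can be checked numerically. The following is a minimal sketch, not the answerer's own code: the function names `chain_quotient`, `plain_quotient`, and `flat_near_zero` are hypothetical, and the test function $g(t)=|t-1|+|t+1|$ (constant on $[-1,1]$) is an assumption chosen so that both branches are exercised.

```python
import math

def chain_quotient(f, g, x, dx):
    """Piecewise quotient: 0 if the change in g is 0, else (df/dg) * (dg/dx).

    dx is assumed to be nonzero.
    """
    dg = g(x + dx) - g(x)
    if dg == 0:
        return 0.0
    df = f(g(x) + dg) - f(g(x))
    return (df / dg) * (dg / dx)

def plain_quotient(f, g, x, dx):
    """Ordinary difference quotient of the composition f(g(x))."""
    return (f(g(x + dx)) - f(g(x))) / dx

def flat_near_zero(t):
    """Hypothetical test function: equals 2 on [-1, 1], so its increment can vanish."""
    return abs(t - 1.0) + abs(t + 1.0)

# Branch with zero increment in g (dx = 0.5 stays inside [-1, 1]).
zero_branch = chain_quotient(math.sin, flat_near_zero, 0.0, 0.5)
# Branch with nonzero increment in g (dx = 2.0 leaves [-1, 1]).
nonzero_branch = chain_quotient(math.sin, flat_near_zero, 0.0, 2.0)

print(zero_branch, plain_quotient(math.sin, flat_near_zero, 0.0, 0.5))
print(nonzero_branch, plain_quotient(math.sin, flat_near_zero, 0.0, 2.0))
```

In both branches the two functions should print matching values (up to floating-point rounding), which illustrates why the zero branch "fills in" the quotient consistently rather than changing it.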

2
On

(Note: the proof has been edited so the comment below no longer applies.)

In case 1, you are assuming that $g(x+h) = g(x)$ for all real numbers $h$. In case 2, you are assuming that $g(x+h) \neq g(x)$ for all nonzero real numbers $h$. However, there is a third case that you have not covered, which is the case where $g(x+h)=g(x)$ for some but not all nonzero real numbers $h$.

By the way, the assumptions you made in each case could have been stated more clearly by inserting phrases such as "for all real numbers $h$". You could also use phrasing like, "if $h$ is a nonzero real number then $g(x+h) \neq g(x)$."


Follow-up comment: The proof has been revised to say:

if $h$ has a value such that $g(x+h)=g(x)$ then: $$\tag{1}\lim \limits_{h \to 0}\frac{f(g(x+h))-f(g(x))}{h}=0$$

But, how does equation (1) follow from the fact that there is a value of $h$ such that $g(x + h) = g(x)$? That is a non-sequitur.

Equation (1) would be obviously true if $g(x + h) = g(x)$ for all real numbers $h$. But case 1 (as written currently) only assumes that there exists a value of $h$ such that $g(x + h) = g(x)$.

0
On

An easy way to avoid the problem with the case $g'(p)=0$ is to perturb $g$ by a linear function: we can evaluate the derivative $$ \frac{d}{dx}f(g(x)+\epsilon x)\Big|_{x=p} $$ using the "Leibniz way" since $g'(p)+\epsilon\neq 0$. Hence, $$ \frac{d}{dx}f(g(x)+\epsilon x)\Big|_{x=p}=f'(g(p)+\epsilon p)(0+\epsilon). $$ The left hand side depends continuously on $\epsilon$, so letting $\epsilon\to 0$ gives $$ \frac{d}{dx}f(g(x))\Big|_{x=p}=0=f'(g(p))g'(p). $$

7
On

Your case 1 is irreparably flawed.$ \def\lfrac#1#2{{\large\frac{#1}{#2}}} $ In that case you claimed that if $g(x+h)=g(x)$ for some $h$ then $\lim_{h\to0} \lfrac{f(g(x+h))-f(g(x))}{h} = 0$. That is false. For example let $f$ be the identity function, and $g = \sin$ and $x = 0$ and $h = \pi$. Then $g(x+h) = g(x)$ but $\lim_{h\to0} \lfrac{f(g(x+h))-f(g(x))}{h}$ $= \lim_{h\to0} \lfrac{\sin(h)-\sin(0)}{h} = 1$, contradicting your claim.

Your case 2 is also completely broken for the same logical reason, because even if $g(x+h) \ne g(x)$ for some $h$ it does not mean that $\lim_{h\to0} \lfrac{f(g(x+h))-f(g(x))}{g(x+h)-g(x)}$ exists, so your first line in that case is already wrong. For example let $f$ be the identity function again, and $g(t) = |t-1|+|t+1|$ for every real $t$, and $x = 0$. Then $g(2) \ne g(0)$ but $\lim_{h\to0} \lfrac{f(g(x+h))-f(g(x))}{g(x+h)-g(x)}$ does not exist because $g(x+h)-g(x) = 0$ for every $h \in [-1,1]$. Note that $g'(0) = 0$ and $f'(2) = 1$, and the chain rule still holds, but the limit you wrote down does not exist.
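Both counterexamples are easy to check numerically. The snippet below is only an illustrative sketch: the helper names `diff_quotient` and `flat_near_zero` are hypothetical, $f$ is taken to be the identity as in the examples above, and the step sizes are arbitrary choices.

```python
import math

def diff_quotient(func, x, h):
    """Difference quotient (func(x + h) - func(x)) / h for nonzero h."""
    return (func(x + h) - func(x)) / h

# Example 1: g = sin, x = 0. At h = pi the increment vanishes (up to rounding),
# yet the difference quotients for small h approach 1, not 0.
increment_at_pi = math.sin(0.0 + math.pi) - math.sin(0.0)
sin_quotients = [diff_quotient(math.sin, 0.0, h) for h in (1e-1, 1e-2, 1e-3)]

# Example 2: g(t) = |t - 1| + |t + 1|, x = 0.
def flat_near_zero(t):
    return abs(t - 1.0) + abs(t + 1.0)

# g(2) differs from g(0), but for every small h the increment is exactly 0,
# so the quotient with g(x + h) - g(x) in the denominator is 0/0 near h = 0.
increment_at_2 = flat_near_zero(2.0) - flat_near_zero(0.0)
small_increments = [
    flat_near_zero(h) - flat_near_zero(0.0) for h in (0.5, 0.25, -0.25, -0.5)
]

print(increment_at_pi, sin_quotients)
print(increment_at_2, small_increments)
```

The point in both cases is the same: what happens at one particular value of $h$ says nothing about the behavior of the quotient as $h\to 0$.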