Can the chain rule be proven by math induction?


I need to prove the chain rule for a math project and I am wondering if it can be proven by math induction. If not, how can this rule be proven?

There is 1 solution below.

I will outline a proof here adapted from an old calculus text by James Stewart. Since it comes from a standard calculus text rather than an analysis book, it is rigorous but not as terse as an analysis treatment. As for induction: the Chain Rule for two functions is proved directly, since there is no integer parameter to induct on; however, once the two-function case is established, induction does extend it to compositions of $n$ functions. A proof from Rudin is offered at the end for a different style.


Prefatory comments: Recall that if $y=f(x)$ and $x$ changes from $a$ to $a+\Delta x$, we define the increment of $y$ as $$ \Delta y=f(a+\Delta x)-f(a). $$ According to the definition of a derivative, we have $$ \lim_{\Delta x\to 0}\frac{\Delta y}{\Delta x}=f'(a). $$ So if we denote by $\epsilon$ the difference between the difference quotient and the derivative, we obtain $$ \lim_{\Delta x\to 0}\epsilon = \lim_{\Delta x\to 0}\left(\frac{\Delta y}{\Delta x}-f'(a)\right)=f'(a)-f'(a)=0. $$ But $$ \epsilon=\frac{\Delta y}{\Delta x}-f'(a)\quad\Longrightarrow\quad\Delta y = f'(a)\Delta x+\epsilon\Delta x. $$

If we define $\epsilon$ to be $0$ when $\Delta x=0$, then $\epsilon$ becomes a continuous function of $\Delta x$. Thus, for a differentiable function $f$, we can write $$ \Delta y=f'(a)\Delta x+\epsilon\Delta x\quad\text{where}\quad\epsilon\to 0\quad\text{as}\quad\Delta x\to 0,\tag{1} $$ and $\epsilon$ is a continuous function of $\Delta x$. This property of differentiable functions is what enables us to prove the Chain Rule.
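To make equation $(1)$ concrete, here is a small numerical illustration (not part of the proof): for the hypothetical choice $f(x)=x^2$ at $a=1$, we get $\Delta y = 2\Delta x + (\Delta x)^2$, so $\epsilon = \Delta y/\Delta x - f'(a) = \Delta x$, which visibly tends to $0$ with $\Delta x$.

```python
# Illustration of (1): Delta_y = f'(a)*dx + epsilon*dx, with epsilon -> 0 as dx -> 0.
# For f(x) = x**2 at a = 1, epsilon works out to exactly dx (up to rounding).
def f(x):
    return x * x

a = 1.0
fprime_a = 2.0  # f'(x) = 2x, so f'(1) = 2

for dx in [0.1, 0.01, 0.001]:
    delta_y = f(a + dx) - f(a)
    eps = delta_y / dx - fprime_a
    print(dx, eps)  # eps shrinks with dx; here eps equals dx up to rounding
```

Any differentiable $f$ would do; $x^2$ is chosen only because $\epsilon$ can be computed in closed form.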


Proof: Suppose $u=g(x)$ is differentiable at $a$ and $y=f(u)$ is differentiable at $b=g(a)$. If $\Delta x$ is an increment in $x$ and $\Delta u$ and $\Delta y$ are the corresponding increments in $u$ and $y$, then we can use equation $(1)$ to write $$ \Delta u=g'(a)\Delta x+\epsilon_1\Delta x=[g'(a)+\epsilon_1]\Delta x,\tag{2} $$ where $\epsilon_1\to 0$ as $\Delta x\to 0$. Similarly, $$ \Delta y = f'(b)\Delta u+\epsilon_2\Delta u=[f'(b)+\epsilon_2]\Delta u,\tag{3} $$ where $\epsilon_2\to 0$ as $\Delta u\to 0$. If we now substitute the expression for $\Delta u$ from equation $(2)$ into equation $(3)$, we get $$ \Delta y=[f'(b)+\epsilon_2][g'(a)+\epsilon_1]\Delta x $$ so $$ \frac{\Delta y}{\Delta x}=[f'(b)+\epsilon_2][g'(a)+\epsilon_1]. $$ As $\Delta x\to 0$, equation $(2)$ shows that $\Delta u\to 0$. So both $\epsilon_1\to0$ and $\epsilon_2\to0$ as $\Delta x\to 0$. Therefore, \begin{align} \frac{dy}{dx} &= \lim_{\Delta x\to0}\frac{\Delta y}{\Delta x}\\[1em] &= \lim_{\Delta x\to 0}[f'(b)+\epsilon_2][g'(a)+\epsilon_1]\\[1em] &= f'(b)g'(a)=f'(g(a))g'(a). \end{align} This proves the Chain Rule (see below for a much more terse treatment). $\Box$
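The conclusion $dy/dx = f'(g(a))\,g'(a)$ can be sanity-checked numerically. The functions below are illustrative choices, not from the proof: $g(x)=x^2$, $f(u)=\sin u$, so the Chain Rule predicts the derivative of $f(g(x))$ at $a$ is $\cos(a^2)\cdot 2a$.

```python
import math

# Numerical check of the Chain Rule for g(x) = x**2, f(u) = sin(u):
# the derivative of f(g(x)) at a should equal f'(g(a)) * g'(a) = cos(a**2) * 2a.
def g(x):
    return x * x

def f(u):
    return math.sin(u)

a = 0.7
exact = math.cos(a * a) * 2 * a  # f'(g(a)) * g'(a)

dx = 1e-6
numeric = (f(g(a + dx)) - f(g(a - dx))) / (2 * dx)  # central difference quotient

print(exact, numeric)  # the two values agree closely
```

The central difference is used because its error is $O(\Delta x^2)$, so the agreement with the Chain Rule value is visible even at modest step sizes.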


Proof of Chain Rule in Baby Rudin: The Chain Rule is first stated as a theorem:

Theorem. Suppose $f$ is continuous on $[a,b], f'(x)$ exists at some point $x\in[a,b], g$ is defined on an interval $I$ which contains the range of $f$, and $g$ is differentiable at the point $f(x)$. If $$ h(t) = g(f(t))\qquad(a\leq t\leq b), $$ then $h$ is differentiable at $x$, and $$ h'(x) = g'(f(x))f'(x).\tag{1} $$

Proof. Let $y=f(x)$. By the definition of the derivative, we have $$ f(t)-f(x)=(t-x)[f'(x)+u(t)],\tag{2} $$ and $$ g(s)-g(y)=(s-y)[g'(y)+v(s)],\tag{3} $$ where $t\in[a,b], s\in I$, and $u(t)\to 0$ as $t\to x, v(s)\to 0$ as $s\to y$. Let $s=f(t)$. Using first $(3)$ and then $(2)$, we obtain \begin{align} h(t)-h(x) &= g(f(t))-g(f(x))\\ &= [f(t)-f(x)]\cdot[g'(y)+v(s)]\\ &= (t-x)\cdot[f'(x)+u(t)]\cdot[g'(y)+v(s)], \end{align} or, if $t\neq x$, $$ \frac{h(t)-h(x)}{t-x}=[g'(y)+v(s)]\cdot[f'(x)+u(t)].\tag{4} $$ Letting $t\to x$, we see that $s\to y$, by the continuity of $f$, so that the right side of $(4)$ tends to $g'(y)f'(x)$, which gives $(1)$. $\Box$
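Rudin's decomposition $(2)$ can also be seen concretely. For the illustrative choice $f(t)=t^3$ at $x=1$ (not from Rudin), algebra gives $u(t) = \frac{t^3-1}{t-1} - 3 = t^2+t-2 = (t-1)(t+2)$, which indeed tends to $0$ as $t\to x$:

```python
# Illustration of Rudin's equation (2): u(t) = (f(t) - f(x))/(t - x) - f'(x).
# For f(t) = t**3 at x = 1, u(t) = (t - 1)(t + 2), which -> 0 as t -> 1.
def f(t):
    return t ** 3

x = 1.0
fprime_x = 3.0  # f'(t) = 3t**2, so f'(1) = 3

for t in [1.1, 1.01, 1.001]:
    u = (f(t) - f(x)) / (t - x) - fprime_x
    print(t, u)  # u(t) shrinks as t approaches x, matching (t - 1)*(t + 2)
```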