I saw this proof of the Chain Rule in Hardy's *A Course of Pure Mathematics* (the notation I use will be a little different).
• Chain Rule:
Let $f\,\colon Y \subset \mathbb{R} \to \mathbb{R}$ and $g\,\colon X \subset\mathbb{R} \to Y$ be real-valued functions. Suppose that $g$ is differentiable at $x_0$ and $f$ is differentiable at $g(x_0)$. Then $f \circ g\,\colon X \to \mathbb{R}$ is differentiable at $x_0$, and, writing $y = f(g(x))$ and $u = g(x)$, $$\begin{align*} (f \circ g)'(x_0) &= (f' \circ g)(x_0) \cdot g'(x_0) \\[5pt] &\text{ or } \\[5pt] (f \circ g)'(x_0) &= f'(g(x_0)) \cdot g'(x_0) \\[5pt] &\text{ or } \\[5pt] \left. \frac{\mathrm{d}y}{\mathrm{d}x} \right|_{x=x_0} &= \left. \frac{\mathrm{d}y}{\mathrm{d}u} \right|_{u=g(x_0)} \cdot \left. \frac{\mathrm{d}u}{\mathrm{d}x} \right|_{x=x_0} \\[5pt] &\text{ or } \\[5pt] \left. \frac{\mathrm{d}(f \circ g)}{\mathrm{d}x} \right|_{x=x_0} &= \left. \frac{\mathrm{d}f}{\mathrm{d}u} \right|_{u=g(x_0)} \cdot \left. \frac{\mathrm{d}u}{\mathrm{d}x} \right|_{x=x_0}. \end{align*}$$
• Proof:
Let $u_0 = g(x_0)$ and $k = g(x_0 + h) - g(x_0)$, so that $k \to 0$ as $h \to 0$ and $$ \lim_{h \to 0} \frac{k}{h} = g'(x_0). \tag{i} $$ We must now distinguish two cases.
(a) Case 1:
Suppose that $g'(x_0) \neq 0$ and that $h$ is small but not zero. Then $k \neq 0$ for all sufficiently small $h$, because of (i), and $$\begin{align*} (f \circ g)'(x_0) &= \lim_{h \to 0} \frac{f(g(x_0 + h)) - f(g(x_0))}{h} \\[5pt] &= \lim_{k \to 0} \frac{f(u_0 + k) - f(u_0)}{k} \cdot \lim_{h \to 0} \frac{k}{h} \\[5pt] &= f'(u_0) \cdot g'(x_0) = f'(g(x_0)) \cdot g'(x_0). \end{align*}$$
(b) Case 2:
Suppose that $g'(x_0) = 0$, and that $h$ is small but not zero. There are now two possibilities. If $k = 0$, then $$\begin{align*} (f \circ g)'(x_0) &= \lim_{h \to 0} \frac{f(g(x_0 + h)) - f(g(x_0))}{h} \\[5pt] &= \lim_{h \to 0} \frac{f(u_0 + k) - f(u_0)}{h} \\[5pt] &= \lim_{h \to 0} \frac{f(u_0) - f(u_0)}{h} \\[5pt] &= 0 = f'(g(x_0)) \cdot \underbrace{g'(x_0)}_{0}. \end{align*}$$
If $k \neq 0$, then $$\begin{align*} (f \circ g)'(x_0) &= \lim_{h \to 0} \frac{f(g(x_0 + h)) - f(g(x_0))}{h} \\[5pt] &= \lim_{k \to 0} \frac{f(g(x_0) + k) - f(g(x_0))}{k} \cdot \lim_{h \to 0} \frac{k}{h} \\[5pt] &= f'(g(x_0)) \cdot \underbrace{g'(x_0)}_{0} = 0. \tag*{$\blacksquare$} \end{align*}$$
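(As a quick numerical sanity check of the conclusion — my own addition, with $f(u) = u^2$ and $g(x) = \sin x$ chosen purely for illustration — one can compare a difference quotient for $(f \circ g)'$ against the Chain Rule formula:)

```python
import math

# Numerical sanity check of (f o g)'(x0) = f'(g(x0)) * g'(x0),
# using the illustrative choice f(u) = u^2, g(x) = sin x.
f = lambda u: u * u
g = math.sin
f_prime = lambda u: 2 * u   # f'(u)
g_prime = math.cos          # g'(x)

x0, h = 0.7, 1e-6
lhs = (f(g(x0 + h)) - f(g(x0 - h))) / (2 * h)  # central difference quotient for (f o g)'
rhs = f_prime(g(x0)) * g_prime(x0)             # the Chain Rule formula
print(lhs, rhs)  # the two values agree to many decimal places
```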
• My Questions:
$1.$ In Case (2), why do we have to consider the case of $k \neq 0$ when $g'(x_0) = 0$?
I tried looking around and searching for functions where this does occur, because I was confused about how $k$ can possibly be nonzero when $g'(x_0) = \lim\limits_{h \to 0} \frac{k}{h} = 0$.
I did find one example, but it seems to be specifically a piecewise-defined function (with $x_0 = 0$):
$$\begin{align*} g(x) &= \begin{cases} x^2 \sin\left(\dfrac{1}{x}\right)& \text{if $x \neq 0$,} \\ 0 & \text{if $x = 0$.} \end{cases} \\[10pt] g'(0) &= \lim_{h \to 0} \frac{g(0 + h) - g(0)}{h} \\[10pt] &= \lim_{h \to 0} \frac{h^2\sin\left(\dfrac{1}{h}\right) - 0}{h} \\[10pt] &= \lim_{h \to 0} h \sin\left(\dfrac{1}{h}\right) \\ &= 0 \end{align*}$$
because $$\begin{align*} -1 &\leq \sin\left(\dfrac{1}{h}\right) \leq 1, \\[5pt] -\lvert h \rvert &\leq h \sin\left(\dfrac{1}{h}\right) \leq \lvert h \rvert, \\[5pt] -\lim_{h \to 0}\, \lvert h \rvert &\leq \lim_{h \to 0} h \sin\left(\dfrac{1}{h}\right) \leq \lim_{h \to 0}\, \lvert h \rvert, \\[5pt] 0 &\leq g'(0) \leq 0. \tag{Squeeze Theorem} \end{align*}$$ Are there other examples of functions for which we need the case $k \neq 0$ even though $g'(x_0) = 0$? Specifically: do we really need to consider that case *if* we restrict the Chain Rule to differentiable functions that are not defined piecewise?
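(To convince myself the example really exhibits the behavior in question, here is a small numerical check — my own addition — that $g'(0) = 0$ while $k = g(0 + h) - g(0)$ is nonzero for arbitrarily small $h$:)

```python
import math

# g(x) = x^2 sin(1/x) for x != 0, g(0) = 0: the piecewise example above.
def g(x):
    return x * x * math.sin(1.0 / x) if x != 0 else 0.0

for n in (1, 10, 100, 1000):
    h = 2.0 / ((4 * n + 1) * math.pi)  # at these points sin(1/h) = 1, so k = h^2
    k = g(h) - g(0)
    print(h, k, k / h)  # k is nonzero, yet the quotient k/h (= h) shrinks to 0
```

Conversely, at $h = \frac{1}{n\pi}$ one has $\sin(1/h) = 0$ and hence $k = 0$ exactly, so both sub-cases of Case 2 genuinely occur for this single function.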
$2.$ Is the Leibniz notation for the Chain Rule 'correct', or can it be written more formally?
$3.$ Is this proof general and rigorous enough for differentiating single-variable function compositions? If not, could you show me a better, more rigorous proof, please? (I haven't studied real analysis, but I am somewhat comfortable reading $\varepsilon$–$\delta$ proofs, though I can't yet come up with them on my own.)
Thank you.