Here's an attempted proof of the chain rule. I end up with the right formula, but I'm curious what the right way is to replace a function with its best linear approximation inside limits like these when you're trying to find an explicit form for a derivative involving unknown functions.
In the attempted proof below, $f$ and $g$ are each evaluated twice, at arguments that are very close to each other. Based on that, I replaced $f$ and $g$ with linear approximations centered at one of those arguments.
Is there a way to perform a substitution like this rigorously?
I start from a slightly modified form of the definition of the derivative:
$$ x \cdot D(f \circ g)(x) = \lim_{h \to 1}\frac{f(g(hx)) - f(g(x))}{h-1} $$
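As a quick check that this form agrees with the usual definition (a sketch, assuming $x \neq 0$ and that $f \circ g$ is differentiable at $x$; the names $F = f \circ g$ and $k$ are just shorthand I'm introducing here): substitute $k = (h-1)x$, so that $hx = x + k$ and $k \to 0$ as $h \to 1$.

$$ \lim_{h \to 1}\frac{F(hx) - F(x)}{h-1} = \lim_{k \to 0} x \cdot \frac{F(x+k) - F(x)}{k} = x \cdot (DF)(x) $$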
Next, I use the following linear approximations to $f$ and $g$ at strategically chosen points.
$$ f(z) \approx cz + d $$
$cz + d$ is the best linear approximation to $f$ at $(g(x), f(g(x)))$
$$ g(z) \approx az + b $$
$az + b$ is the best linear approximation to $g$ at $(x, g(x))$
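Spelled out, and assuming $g$ is differentiable at $x$ and $f$ is differentiable at $g(x)$, the coefficients are fixed by matching the value and the slope at those two points (with $x$ held fixed and $z$ as the variable):

$$ a = (Dg)(x), \quad b = g(x) - a x, \quad c = (Df)(g(x)), \quad d = f(g(x)) - c \, g(x) $$

In particular $ax + b = g(x)$ and $c \, g(x) + d = f(g(x))$, so both approximations are exact at the base points and only approximate nearby.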
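Substituting the approximation for $g$ into the limit gives
$$ \lim_{h \to 1} \frac{f(ahx + b) - f(ax+b)}{h-1} $$
and then substituting the approximation for $f$ gives
$$ \lim_{h \to 1} \frac{achx + cb + d - cax - cb - d}{h - 1} $$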
$$ \lim_{h \to 1} \frac{ca(h-1)x}{h-1} $$
$$ \lim_{h \to 1} cax $$
And thus:
$$ x \cdot D(f \circ g)(x) = cax = (Df)(g(x)) \cdot (Dg)(x) \cdot x $$
Dividing by $x$ (for $x \neq 0$):
$$ D(f \circ g)(x) = (Df)(g(x)) \cdot (Dg)(x) $$