\begin{aligned} \frac{dy}{dx} = \lim_{\Delta x\to 0} \frac{\Delta y}{\Delta x} \end{aligned}
When $y$ is a function of $u$ and $u$ is a function of $x$, the chain rule says we can compute it this way:
\begin{aligned} \frac{dy}{dx} = \frac{dy}{du} * \frac{du}{dx} \end{aligned}
The usual proof of the above equation goes as follows:
1. We start from the following identity for the difference quotients:
\begin{aligned} \frac{\Delta y}{\Delta x} = \frac{\Delta y}{\Delta u} * \frac{\Delta u}{\Delta x} \end{aligned}
2. Then we take the limit to convert $\frac{\Delta y}{\Delta x}$ into $\frac{dy}{dx}$:
\begin{align} \frac{dy}{dx} &= \lim_{\Delta x\to 0} \frac{\Delta y}{\Delta x}\\&= \lim_{\Delta x\to 0} [\frac{\Delta y}{\Delta u} * \frac{\Delta u}{\Delta x}]\\&= \lim_{\Delta x\to 0} [\frac{\Delta y}{\Delta u}] * \lim_{\Delta x\to 0} [\frac{\Delta u}{\Delta x}]\\&= \lim_{\Delta u\to 0} [\frac{\Delta y}{\Delta u}] * \lim_{\Delta x\to 0} [\frac{\Delta u}{\Delta x}]\\&= \frac{dy}{du} * \frac{du}{dx} \end{align}
But replacing $\lim_{\Delta x\to 0}$ with $\lim_{\Delta u\to 0}$ might not be legitimate: ${\Delta u}$ might equal $0$ for some nonzero ${\Delta x}$ as ${\Delta x \to 0}$, in which case $\frac{\Delta y}{\Delta u}$ is undefined.
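To see that this can really happen, here is a small illustrative example (my own choice of function, not taken from Simmons): take $u(x) = x^2 \sin(1/x)$ for $x \neq 0$ and $u(0) = 0$, which is differentiable at $x = 0$. At $x = 0$,
\begin{aligned} \Delta u = u(\Delta x) - u(0) = (\Delta x)^2 \sin\frac{1}{\Delta x}, \qquad \text{so } \Delta u = 0 \text{ whenever } \Delta x = \frac{1}{n\pi},\ n = 1, 2, 3, \dots \end{aligned}
These values of $\Delta x$ are nonzero and come arbitrarily close to $0$, so restricting to small $\Delta x$ does not avoid the division by zero.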
So George Simmons, in his book "Calculus With Analytic Geometry" (pages 94-95), gives the following argument:
\begin{aligned} \frac{\Delta y}{\Delta u} &= \frac{dy}{du} + \epsilon \\ \Rightarrow \Delta y &= \frac{dy}{du} * \Delta u + \epsilon * \Delta u \\ \Rightarrow \frac{\Delta y}{\Delta x} &= \frac{dy}{du} *\frac{\Delta u}{\Delta x} + \epsilon * \frac{\Delta u}{\Delta x} \\ \Rightarrow \frac{dy}{dx} &= \lim_{\Delta x\to 0} \frac{\Delta y}{\Delta x} \\ &= \lim_{\Delta x\to 0} [\frac{dy}{du} *\frac{\Delta u}{\Delta x} + \epsilon * \frac{\Delta u}{\Delta x}] \\ &= \lim_{\Delta x\to 0} [\lim_{\Delta u\to 0} \frac{\Delta y}{\Delta u} *\frac{\Delta u}{\Delta x} + \epsilon * \frac{\Delta u}{\Delta x}] \\ &= \lim_{\Delta x\to 0} [\frac{\Delta y}{\Delta x} + \epsilon * \frac{\Delta u}{\Delta x}] \\ &= \frac{dy}{dx} + \epsilon * \frac{du}{dx} \end{aligned}
But why are we doing it this way? I truly don't understand it. Why not write it this way? $\lim_{\Delta x\to 0} \frac{\Delta y}{\Delta u} = \frac{dy}{du} + \epsilon$
His argument is that when one writes
\begin{equation}\lim _{{\Delta} u \rightarrow 0} \frac{{\Delta} y}{{\Delta} u} = \frac{d y}{d u}\end{equation}
it actually means
\begin{equation}\lim _{\substack{{\Delta} u \rightarrow 0\\ {\Delta} u \neq 0 }} \frac{{\Delta} y}{{\Delta} u} = \frac{d y}{d u}\end{equation}
so one cannot write
\begin{equation}\frac{{\Delta} y}{{\Delta} u} = \frac{d y}{d u}+{\epsilon}\end{equation}
when there is a risk that $ {\Delta} u = 0$. In the case of the chain rule, he says that $u$ depends on $x$, so when $ {\Delta} x \neq 0$, it is not necessarily true that $ {\Delta} u \neq 0$. So instead of writing the above, he writes
\begin{equation}{\Delta} y = \frac{d y}{d u} {\Delta} u+{\epsilon} {\Delta} u\end{equation}
which remains true even when $ {\Delta} u = 0$.
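To spell out why, here is a sketch under the usual assumption that $y$ is differentiable with respect to $u$ at the point in question. One can define $\epsilon$ as a function of $\Delta u$ by
\begin{equation}\epsilon(\Delta u) = \begin{cases} \dfrac{\Delta y}{\Delta u} - \dfrac{d y}{d u}, & \Delta u \neq 0 \\ 0, & \Delta u = 0 \end{cases}\end{equation}
For $\Delta u \neq 0$, the identity is just this definition multiplied through by $\Delta u$. For $\Delta u = 0$, we have $\Delta y = 0$, so both sides are $0$. Moreover, $\epsilon(\Delta u) \to 0$ as $\Delta u \to 0$, by the definition of $\frac{d y}{d u}$.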
So the proof of the chain rule is now
\begin{equation}\renewcommand{\arraystretch}{2} \begin{array}{rcl}\displaystyle \lim _{{\Delta} x \rightarrow 0} \frac{{\Delta} y}{{\Delta} x}&=&\displaystyle \lim _{{\Delta} x \rightarrow 0} \left(\frac{d y}{d u} \frac{{\Delta} u}{{\Delta} x}+{\epsilon} \left({\Delta} u\right) \frac{{\Delta} u}{{\Delta} x}\right)\\ &=&\displaystyle \frac{d y}{d u} \lim _{{\Delta} x \rightarrow 0} \frac{{\Delta} u}{{\Delta} x}+\displaystyle \lim _{{\Delta} x \rightarrow 0} {\epsilon} \left({\Delta} u\right) \displaystyle \lim _{{\Delta} x \rightarrow 0} \frac{{\Delta} u}{{\Delta} x}\\ &=&\displaystyle \frac{d y}{d u} \frac{d u}{d x}+{0}\times{\frac{d u}{d x}}\\ &=&\displaystyle \frac{d y}{d u} \frac{d u}{d x} \end{array}\end{equation}
In this derivation, we never divide by $\Delta u$.
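As a concrete check of the derivation above (an illustrative example of my own, not from the book), take $y = u^2$ with $u$ any differentiable function of $x$. Then
\begin{equation}{\Delta} y = \left(u + {\Delta} u\right)^2 - u^2 = 2u \, {\Delta} u + {\Delta} u \cdot {\Delta} u\end{equation}
so here $\frac{d y}{d u} = 2u$ and ${\epsilon}\left({\Delta} u\right) = {\Delta} u$. Dividing by ${\Delta} x$ and letting ${\Delta} x \to 0$ gives
\begin{equation}\frac{d y}{d x} = 2u \frac{d u}{d x} + 0 \times \frac{d u}{d x} = 2u \frac{d u}{d x}\end{equation}
where ${\epsilon}\left({\Delta} u\right) = {\Delta} u \to 0$ because $u$, being differentiable, is continuous in $x$.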
I think he could have introduced the Taylor expansion of order 1, which would clarify the argument.
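A sketch of what that might look like, writing $y = f(u)$ (the name $f$ is introduced here only for illustration): the first-order Taylor expansion with remainder is
\begin{equation}f(u + {\Delta} u) = f(u) + f'(u) \, {\Delta} u + {\epsilon}\left({\Delta} u\right) {\Delta} u, \qquad \lim _{{\Delta} u \rightarrow 0} {\epsilon}\left({\Delta} u\right) = 0\end{equation}
This is the same identity as ${\Delta} y = \frac{d y}{d u} {\Delta} u + {\epsilon} {\Delta} u$, stated with no division by ${\Delta} u$; setting ${\Delta} u = 0$ simply gives $f(u) = f(u)$.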