How is the formula for the total derivative with respect to x derived?
The formula is:
$$\large \frac{df}{dx}=\frac{\partial f}{\partial x}+\frac{\partial f}{\partial y}\frac{dy}{dx}+\frac{\partial f}{\partial z}\frac{dz}{dx}+\cdots$$
Alternatively, multiplying both sides by $dx$ gives the formula for the total differential:
$$\large df=\frac{\partial f}{\partial x}dx+\frac{\partial f}{\partial y}dy+\frac{\partial f}{\partial z}dz+\cdots$$
Wikipedia mentions the chain rule but never actually derives the formula. Additionally, the notation $\large \frac{dy}{dx}$ confuses me a bit: what exactly does it refer to? Is it the case that it can be solved for only by holding the function itself constant?
Let $f:U \subseteq \mathbb{R}^n \to \mathbb{R}$ and $x:V \subseteq \mathbb{R} \to U$. For a given $t \in V$, the requirements on $x$ and $f$ are:
The first derivative of each component $x_k$ of $x$ ($0 \leq k < n$) exists at $t$.
$f$ is differentiable at $a = x(t)$.
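Define $E:S \subseteq \mathbb{R}^n \to \mathbb{R}$, the error of the linear approximation relative to $\left\lVert h \right\rVert$, by
$$ E(h) = \frac{1}{\left\lVert h \right\rVert}\left(f(a+h) - f(a) - \sum_{k=0}^{n-1} \frac{\partial f}{\partial x_k}(a) h_k\right) \quad (h \neq 0), \qquad E(0) = 0. $$
Differentiability of $f$ at $a$ means exactly that $E(h) \to 0$ as $h \to 0$.

Rearranging (the identity also holds trivially at $h = 0$),
$$ f(a+h) - f(a) = \left\lVert h \right\rVert E(h) + \sum_{k=0}^{n-1} \frac{\partial f}{\partial x_k}(a) h_k $$

Let $h_k = x_k(t+\sigma) - x_k(t)$, where $\sigma \neq 0$ is small enough for $x(t+\sigma)$ to be defined, so that $x(t) + h = x(t+\sigma)$. Observe that
$$ \lim_{\sigma \to 0} \frac{h_k}{\sigma} = \frac{dx_k}{dt} $$
$$\begin{align*} \frac{df}{dt} &= \lim_{\sigma \to 0} \frac{f(x(t) + h)-f(x(t))}{\sigma} \\ &= \lim_{\sigma \to 0} \left(\frac{\left\lVert h \right\rVert}{\sigma} E(h) + \sum_{k=0}^{n-1} \frac{\partial f}{\partial x_k}(a) \frac{h_k}{\sigma} \right) \end{align*}$$

Each $x_k$ is differentiable at $t$ and therefore continuous there, so $h \to 0$ as $\sigma \to 0$, and hence $E(h) \to 0$ by the differentiability of $f$. Moreover, $\left\lvert \frac{\left\lVert h \right\rVert}{\sigma} \right\rvert = \left\lVert \frac{h}{\sigma} \right\rVert \to \left\lVert \frac{dx}{dt} \right\rVert$, which is finite, so the first term is a bounded factor times a factor tending to $0$. Therefore
$$ \frac{df}{dt} = \lim_{\sigma \to 0} \left(\frac{\left\lVert h \right\rVert}{\sigma} E(h)\right) + \lim_{\sigma \to 0} \left(\sum_{k=0}^{n-1} \frac{\partial f}{\partial x_k}(a) \frac{h_k}{\sigma} \right) = 0 + \sum_{k=0}^{n-1} \frac{\partial f}{\partial x_k}(a) \frac{dx_k}{dt} $$

In the notation of the question, the parameter is $x$ itself and the curve is $(x, y(x), z(x))$, so the first term has $\frac{dx}{dx} = 1$, and $\frac{dy}{dx}$ is simply the ordinary derivative of the function $y(x)$ along that curve.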
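As a sanity check on the chain-rule formula derived above, here is a minimal numerical sketch in Python. The function $f(x,y,z) = xy + \sin z$ and the dependencies $y = x^2$, $z = 3x$ are arbitrary choices made for illustration and are not part of the question; all function names are hypothetical. The sketch compares the right-hand side of the formula with a central-difference quotient of the composed function.

```python
import math

# Illustrative choices (assumptions, not from the question):
# f(x, y, z) = x*y + sin(z), with y = x**2 and z = 3*x.
def f(x, y, z):
    return x * y + math.sin(z)

def y_of(x):
    return x ** 2

def z_of(x):
    return 3.0 * x

def total_derivative_formula(x):
    """Right-hand side: f_x + f_y * dy/dx + f_z * dz/dx, evaluated on the curve."""
    y, z = y_of(x), z_of(x)
    f_x = y              # partial of x*y + sin(z) with respect to x
    f_y = x              # partial with respect to y
    f_z = math.cos(z)    # partial with respect to z
    dy_dx = 2.0 * x      # ordinary derivative of y(x) = x**2
    dz_dx = 3.0          # ordinary derivative of z(x) = 3*x
    return f_x + f_y * dy_dx + f_z * dz_dx

def total_derivative_numeric(x, sigma=1e-6):
    """Central-difference quotient of the composition g(x) = f(x, y(x), z(x))."""
    def g(s):
        return f(s, y_of(s), z_of(s))
    return (g(x + sigma) - g(x - sigma)) / (2.0 * sigma)

if __name__ == "__main__":
    x0 = 0.7
    print("formula:", total_derivative_formula(x0))
    print("numeric:", total_derivative_numeric(x0))
```

The two values should agree up to the finite-difference error. Note that $\frac{dy}{dx}$ here is computed from the known function $y(x)$, with no need to hold $f$ constant.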