I am a bit confused about differentials, and this is probably partly due to what I find to be a rather confusing teaching approach. (I know there are a bunch of similar questions around, but none of them clarified my confusion).
When I first met derivatives in calculus, the magic $d$ symbol appeared in the Leibniz notation $\frac{df}{dx}$, and I was told that it is a symbol for differentiation, not a fraction. The chain rule, $\frac{df}{dx} = \frac{df}{du}\frac{du}{dx}$, was taught along the lines of "looks like fraction simplification, but be careful".
Then integrals came around, with not much being said about the $dx$ at the end, until the substitution rule. There I first made contact with differentials, but was merely told that when changing the variable, one also needs to change the differential $dx$ to $du=f'dx$. Now, it seems to me that $du=f'dx$ comes more or less from $f'=\frac{df}{dx}$.
When getting a bit further into higher maths, there are more and more operations with functions and differentials. When doing surface areas, we talk about an area differential, with $ds^2 = dx^2 + dy^2$, or about the differential of a multivariable function, $df = \sum_i\frac{\partial f}{\partial x_i}dx_i$.
It seems that, in some cases, we operate with the differential as if it were a simple real value, following the idea of an infinitesimal (basically $\Delta x = x - x_1$ as $x_1 \rightarrow x$).
Possibly the most important thing for me now: it seems that I can view the differential $df$ as a local linear approximation of $f$, and $\sum_i\frac{\partial f}{\partial x_i}dx_i$ as a decomposition of the tangent along the basis directions.
But at the same time, I remain stuck with the lack of an exact definition of the differential, and I am a bit afraid of using it because of warnings such as "the chain rule is not really a fraction simplification".
A very general definition of the differential is the Fréchet derivative:
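For reference, the standard textbook statement reads as follows; the symbols $V$, $W$, $U$ and $A$ are notation assumed here for illustration. Let $V,W$ be normed vector spaces, $U\subseteq V$ open and $f:U\to W$. Then $f$ is called (Fréchet) differentiable at $x\in U$ if there is a bounded linear map $A:V\to W$ such that $$\lim_{h\to 0}\frac{\|f(x+h)-f(x)-Ah\|_W}{\|h\|_V}=0.$$ In that case $A$ is unique, and one writes $\mathrm df(x)=A$. In other words, $\mathrm df(x)$ is the linear map that approximates $f(x+h)-f(x)$ with an error that vanishes faster than $\|h\|$.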
Some important special cases (note that I use $x\cdot y$ exclusively for the product of two real numbers $x,y$ or for the product of a vector $x\in\mathbb R^n$ and a real number $y\in\mathbb R$):
About your notations: $\frac{\mathrm df}{\mathrm dx}$ is simply a different name for $f'$ where $f$ is as in 2.
$\frac{\mathrm df}{\mathrm dx}=\frac{\mathrm df}{\mathrm du}\frac{\mathrm du}{\mathrm dx}$ is a (maybe a little bit confusing) way of writing the chain rule:
If $f,g$ are as in 1., then $(f\circ g)'(x)=f'(g(x))\cdot g'(x)$ which can be reformulated as $$\mathrm d(f\circ g)(x) = \mathrm df(g(x))\circ\mathrm dg(x).$$
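A small numerical sketch may make the reformulation above more concrete. It treats each differential as the linear map $h\mapsto f'(x)\cdot h$ and checks that composing the maps gives the same result as differentiating the composite. The helper name `differential`, the choice $f=\sin$, $g(x)=x^2$, and the central-difference estimate of the slope are all assumptions made for illustration only.

```python
import math


def differential(func, x, eps=1e-6):
    """Return the linear map h -> func'(x) * h.

    The slope func'(x) is only estimated here, using a central difference.
    """
    slope = (func(x + eps) - func(x - eps)) / (2 * eps)
    return lambda h: slope * h


# Illustrative choices, not taken from the text above.
f = math.sin


def g(x):
    return x ** 2


x = 0.7
h = 0.3

d_g = differential(g, x)                      # dg(x)
d_f = differential(f, g(x))                   # df(g(x))
d_fg = differential(lambda t: f(g(t)), x)     # d(f∘g)(x)

composed = d_f(d_g(h))                        # (df(g(x)) ∘ dg(x))(h)
direct = d_fg(h)                              # d(f∘g)(x)(h)
exact = math.cos(x ** 2) * 2 * x * h          # f'(g(x)) * g'(x) * h by hand

print(composed, direct, exact)
```

The three printed numbers should agree up to the finite-difference error. For maps $\mathbb R\to\mathbb R$, composing two linear maps is the same as multiplying their slopes, which is why the chain rule looks like cancelling fractions in this case.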
A very general case of the chain rule:
If $f:\mathbb R^m\to\mathbb R^k$ and $g:\mathbb R^n\to \mathbb R^m$ are as in 4., then $$\mathrm d(f\circ g)(x) = \mathrm df(g(x))\circ\mathrm dg(x),$$ i.e. $$\operatorname{Jac}(f\circ g)(x)=\operatorname{Jac}f(g(x))\cdot\operatorname{Jac}g(x).$$
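The Jacobian form of the chain rule can be checked numerically in the same spirit. The following is a minimal sketch in plain Python: `jacobian` and `matmul` are hypothetical helpers written for this example, the Jacobians are approximated by central differences, and the concrete maps $g:\mathbb R^2\to\mathbb R^3$ and $f:\mathbb R^3\to\mathbb R^2$ are arbitrary choices for illustration.

```python
import math


def jacobian(func, x, eps=1e-6):
    """Estimate the Jacobian of func at x by central differences.

    func maps a list of n floats to a list of m floats.
    The result is a list of m rows with n entries each.
    """
    n = len(x)
    columns = []
    for j in range(n):
        x_plus = list(x)
        x_minus = list(x)
        x_plus[j] += eps
        x_minus[j] -= eps
        f_plus, f_minus = func(x_plus), func(x_minus)
        # Column j holds the partial derivatives with respect to x_j.
        columns.append([(a - b) / (2 * eps) for a, b in zip(f_plus, f_minus)])
    m = len(columns[0])
    return [[columns[j][i] for j in range(n)] for i in range(m)]


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def g(x):
    """Illustrative map from R^2 to R^3."""
    u, v = x
    return [u * v, math.sin(u), v ** 2]


def f(y):
    """Illustrative map from R^3 to R^2."""
    a, b, c = y
    return [a + b * c, math.exp(a) * c]


x = [0.5, -1.2]

jac_composite = jacobian(lambda t: f(g(t)), x)            # Jac(f∘g)(x), 2x2
jac_product = matmul(jacobian(f, g(x)), jacobian(g, x))   # (2x3) times (3x2)

max_diff = max(
    abs(p - q)
    for row_p, row_q in zip(jac_composite, jac_product)
    for p, q in zip(row_p, row_q)
)
print(max_diff)
```

The printed difference should be tiny, limited only by the finite-difference approximation. Note that the product only makes sense in the order $\operatorname{Jac}f(g(x))\cdot\operatorname{Jac}g(x)$, because the shapes are $k\times m$ and $m\times n$; unlike the one-dimensional case, the factors cannot be swapped or "cancelled".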