I'm reading Tenenbaum and Pollard's Ordinary Differential Equations, where they introduce the concept of the differential. Suppose $y=f(x)$ is differentiable. They define the differential by $dy(x, \Delta x)=f'(x)\Delta x$, and explain that if we think of $dx$ as the differential of the function $x\mapsto x$, then we can write $dy=f'(x)dx$, and this continues to hold when $x$ is a function of another variable, $t$. On pages 51–52, they write
The first-order differential equations we will study in this chapter can be written in the form $$Q(x,y)\frac{dy}{dx} + P(x,y)=0.\tag{6.6}$$ Written in this form, it is assumed that $x$ is the independent variable and $y$ is the dependent variable. If we multiply (6.6) by $dx$, it becomes $$P(x,y)dx + Q(x,y)dy=0.$$ Written in this form, either $x$ or $y$ may be considered as being the dependent variable. In both cases, however, $dy$ and $dx$ are differentials, and not increments.
I'm a little shaky with the notion of dividing by $dx$ in the first place. I guess $dy/dx$ has a removable singularity at $dx=0$ which we can fill in, giving us $(dy/dx)(x)=f'(x)$ for all $x$, whether $x$ depends on some other variable(s) or not. Is that the way I should think of it?
Another thing that concerns me is the potential switching of the dependency. If a solution $y(x)$ is not injective, how can we arbitrarily decide to think of $y$ as the dependent variable? Or perhaps we'll get singularities in our solution where $y'(x)=0$?
I have actually worked with differential forms on smooth manifolds before, so I'm happy with an answer where we think of them as smooth covector fields (here I guess $\Delta x$ is an element of the tangent space at $x$). I would feel wrong dividing by a smooth covector field unless I knew it was nonzero!
Yeah, this seems like a common bit of voodoo. We can't really multiply and divide by differentials, but we can do something like this: imagine a vector-valued function $\ell(t)$, where $\ell: \mathbb R \to \mathbb R^2$. Let $V: \mathbb R^2 \to \mathbb R^2$ be a vector field such that $V(x,y) = P(x,y) \hat x + Q (x,y) \hat y$.
Clearly, if we integrate $V$ along $\ell$, we get
$$\int V \cdot \frac{d\ell}{dt} \, dt$$
If we write $\ell(t) = \bar x(t) \hat x + \bar y(t) \hat y$, we get
$$\int P(\bar x, \bar y) \bar x'(t) + Q(\bar x, \bar y) \bar y'(t) \, dt$$
It's still important, I think, to distinguish between the coordinates $x, y$ and the scalar functions $\bar x(t), \bar y(t)$.
Now, the parameterization is arbitrary. We can, for instance, choose as our parameterization $\bar x(t) = t$. Or rather, we can just use $x$ itself as the parameter, so that $\bar x' = 1$, and we get
$$\int P(x, \bar y(x)) + Q(x, \bar y(x)) \frac{d \bar y}{dx} \, dx$$
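As a quick sanity check on the claim that the parameterization is arbitrary, here's a numeric sketch (my own example, not from the book) with the assumed choices $P(x,y)=y$, $Q(x,y)=x$, along the curve $y=x^2$ from $(0,0)$ to $(1,1)$. Integrating $P\bar x' + Q\bar y'$ with two different parameterizations of the same curve gives the same value:

```python
# Assumed example: P(x,y) = y, Q(x,y) = x, curve y = x^2 from (0,0) to (1,1).
P = lambda x, y: y
Q = lambda x, y: x

def line_integral(xbar, ybar, dxbar, dybar, t0, t1, n=10000):
    """Approximate ∫ P(xbar,ybar) xbar' + Q(xbar,ybar) ybar' dt by the midpoint rule."""
    h = (t1 - t0) / n
    total = 0.0
    for i in range(n):
        t = t0 + (i + 0.5) * h
        x, y = xbar(t), ybar(t)
        total += (P(x, y) * dxbar(t) + Q(x, y) * dybar(t)) * h
    return total

# Parameterization 1: xbar(t) = t, ybar(t) = t^2 — i.e. x itself as the parameter.
I1 = line_integral(lambda t: t,    lambda t: t**2,
                   lambda t: 1.0,  lambda t: 2*t,     0.0, 1.0)

# Parameterization 2: xbar(t) = t^2, ybar(t) = t^4 — the same curve, traversed differently.
I2 = line_integral(lambda t: t**2, lambda t: t**4,
                   lambda t: 2*t,  lambda t: 4*t**3,  0.0, 1.0)

print(I1, I2)  # both ≈ 1, since y dx + x dy = d(xy) and xy runs from 0 to 1
```

The two values agree because the integrand transforms with the chain rule, which is exactly the "multiply by $dx$" bookkeeping made honest.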
So all of this can be phrased in terms of a rigorous set of ideas. When integrating over curves, the distinction between a coordinate (like $y$) and a component function of the parameterization (like $\bar y(t)$) is usually dropped completely. Since that is ordinarily exactly what's meant, maintaining the distinction can feel redundant, but from a pedantic perspective, I think it's helpful to keep.
Overall, there's no need to "divide" any differentials; all we have here are notations for derivatives. Choosing to parameterize with respect to $x$ does carry a danger: not where $dy/dx = 0$ (those points are handled perfectly well) but where the derivative fails to exist (becomes infinite), as at a vertical tangent — a curve that is perfectly smooth as a parameterized curve can still fail to be a graph over $x$ there. Still, since the thrust of this topic is ODEs, such issues should seldom crop up.
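To make the vertical-tangent caveat (and the payoff of switching the dependent variable) concrete, here's an assumed example of my own: the circle $x^2+y^2=1$ solves $x\,dx + y\,dy = 0$. Near the point $(1,0)$, the derivative $dy/dx = -x/y$ blows up, while $dx/dy = -y/x$, obtained by treating $x$ as the dependent variable, stays perfectly tame:

```python
import math

# Assumed illustration: the circle x^2 + y^2 = 1 solves x dx + y dy = 0.
# Approaching (1, 0) along the circle, compare the two derivative choices.
for y in (0.1, 0.01, 0.001):
    x = math.sqrt(1 - y**2)   # point on the circle with this y
    dydx = -x / y             # grows without bound as y -> 0
    dxdy = -y / x             # stays near 0: x(y) is a fine graph here
    print(f"y={y}: dy/dx={dydx:.1f}, dx/dy={dxdy:.4f}")
```

This is exactly the symmetry the form $P\,dx + Q\,dy = 0$ buys you: the curve itself is smooth everywhere, and only the choice of which variable is "independent" breaks down at isolated points.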