Formal definition of the Differential of a function


The formal definition of the differential of a differentiable function $f: x \mapsto y=f(x)$ is that it is a two-variable function named $df$ whose value is $df(x,\Delta_X) = f'(x)\cdot\Delta_X$.

It's used by Courant, for instance, and I read on Wikipedia (http://en.wikipedia.org/wiki/Differential_of_a_function#CITEREFCourant1937i) that this is the modern treatment of differentials in differential calculus.

I'm trying to see how we get from that to $df(x) = f'(x)\,dx$ and then, if $y=f(x)$, to the usual $dy = f'(x)\,dx$ that we see everywhere in connection with linear approximation.

First of all, what would $dx$ mean? Of what function is it the differential? What about $dy$ or $df(x)$: of what function are they the differentials? What would the values of those differentials be?

Since the formal definition treats a differential as a function, I can't understand what the symbols "$dx$" and "$dy$" actually mean in the usual context.
Any help is highly appreciated.


There are 3 best solutions below

On BEST ANSWER

Formally, $x:M\to\mathbb R$ is a map from a manifold $M$ into the reals.

For one dimensional calculus, the manifold $M$ is usually taken to be $\mathbb R$ or a region thereof. $x(p)$ is a function used as a coordinate, and it tells you where on the manifold you are. Its argument is the abstract point on the manifold. Therefore, the manifold is the set of all possible points you might be sitting at. You usually think of just one point at a time.

$y=f(x)$ is also a function on the manifold, and by the chain rule $\mathrm d y|_p= f'(x(p)) \mathrm d x|_p$ at $p$, a point on the manifold.

You could also view $y$ as a local coordinate and then $x=x(y)$ locally and so on.


A vector field $X^a$ on a manifold $M$ is a map from functions $f$ to their rate of change along that vector field, $X(f)=X^a\partial_a f$ in any coordinates. In one dimension, a vector field has one component, so we can write it as $X$. In fact, for a small vector field $\Delta$ we can interpret its action $\Delta (f) = f' \times \Delta$ as a predictor of the result of a small change in position (the flow along the integral curves of $\Delta$).

A differential of a function $\mathrm d f$ is a map from vector fields to functions given by $\mathrm d f(\Delta) \equiv \Delta (f)$. That is,

differentials of functions are maps that send a vector field to the derivative (small change) of the function along that vector field. Equivalently, they take a vector field (small change) together with a point and return a real number: the small change in the function at that point induced by following the vector field away from it.

Therefore $\mathrm d x$ just stores the information about how fast the coordinate $x$ changes. You make arguments like this: $$(\mathrm d f(x))(\Delta)(p)=\Delta(f(x(p)))= \Delta^a(p)\partial_a f(x(p)) = \Delta^a(p) \partial_a x(p)\times f'(x(p)) = f'(x(p)) \Delta (x(p)) = f'(x(p)) (\mathrm d x)(\Delta)(p)$$ and by linearity comparing the left and right we deduce $$\mathrm d f = f'(x) \mathrm d x$$

You can figure out a 'small change' interpretation of all this because the definition of a vector field is exactly what it needs to be for this to work.
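As a rough illustration of the one-dimensional case $M=\mathbb R$, where a vector at a point is just a number $h$, here is a minimal Python sketch. The helper name `differential` and the choice $f(x)=x^2$ are assumptions made for the example, and the derivative is supplied by hand rather than computed. It treats $\mathrm d f$ as a function of a point and a small change, and takes $\mathrm d x$ to be the differential of the coordinate (identity) function.

```python
from typing import Callable

Func = Callable[[float], float]

def differential(f_prime: Func) -> Callable[[float, float], float]:
    """Build df as a two-variable function (x, h) -> f'(x) * h.

    f_prime is the derivative of f, supplied by the caller (an assumption
    of this sketch; nothing is differentiated automatically).
    """
    def df(x: float, h: float) -> float:
        return f_prime(x) * h
    return df

# Example function f(x) = x**2, whose derivative is 2x.
df = differential(lambda x: 2.0 * x)

# The coordinate function x -> x has derivative 1, so dx(x, h) = h.
dx = differential(lambda x: 1.0)

x, h = 3.0, 0.5
lhs = df(x, h)             # df evaluated at the point x and change h
rhs = 2.0 * x * dx(x, h)   # f'(x) times dx evaluated at the same inputs
print(lhs, rhs)
```

Both sides agree for every choice of `x` and `h`, which is what the identity $\mathrm d f = f'(x)\,\mathrm d x$ says once both differentials are read as functions of the same inputs.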

Note: By $\partial_a$ I mean a derivative with respect to the $a$th coordinate which is arbitrary.

On

Since Newton and Leibniz, several approaches appeared to target this question.

Let me mention one of them:

One can realize real infinitesimals as real sequences $(d_1,d_2,d_3,\dots)$ which tend to $0$. Two sequences represent the same 'number' if they differ in only finitely many coordinates. Moreover, any first-order formula is true if it is true at all but finitely many places. (Actually, we take an ultrapower $^*{\Bbb R}:=\Bbb R^{\Bbb N}/{\mathcal U}$ with some nonprincipal ultrafilter $\mathcal U$ on $\Bbb N$.) The original real numbers are embedded in ${}^*{\Bbb R}$ as the constant sequences.
The important consequence is, there are infinitesimals in ${}^*\Bbb R$, for example $$\delta:=(1,\frac12,\frac13,\frac14,\dots)$$ which is bigger than $0$, as it holds for all coordinates, and it is smaller than $\frac1n$ for all $n$ because that fails only in finitely many coordinates.
Let $a\simeq b$ mean that $b-a$ is infinitesimal (i.e. $b-a\in (-1/n,1/n)$ for all $n\in\Bbb N$).

In this setting, we can fix any nonzero infinitesimal and call it $dx$. If a function $f:\Bbb R\,\to\Bbb R$ is given, it extends to ${}^*\Bbb R$ by applying it coordinatewise. Then $f$ is differentiable at the point $x$ with derivative $f'(x)$ iff $\displaystyle\frac{f(x+dx)-f(x)}{dx}\simeq f'(x)$ for every nonzero infinitesimal $dx$.
(So, if we already know that $f$ is differentiable, then we have to check only for one $dx$.)

Now, as $dx$ is considered fixed, the other ones can be given as $$dy:=df(x)=f(x+dx)-f(x)\,.$$ Then we indeed have $\displaystyle\frac{dy}{dx}\simeq f'(x)$, as well as $dy\simeq f'(x)\cdot dx$. And everything gets in place.
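As a small worked example (the choice $f(x)=x^2$ is only for illustration), fix a nonzero infinitesimal $dx$. Then $$dy = (x+dx)^2 - x^2 = 2x\,dx + dx^2, \qquad \frac{dy}{dx} = 2x + dx \simeq 2x = f'(x),$$ since the difference between $2x+dx$ and $2x$ is the infinitesimal $dx$ itself.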

An important point is that sometimes it is useful to switch to another infinitesimal as the base one, say $dy$, and express everything in terms of that...

On

Differentials are a case where the common formal definition found in elementary calculus courses has little to do with the original meaning of the concept.

The original meaning of the concept is an infinitesimal (infinitely small) change in something. $\Delta x$ is a finite change in $x$, but $dx$ is an infinitesimal change in $x$. So $\frac{\Delta y}{\Delta x}$ can be used as the slope of a line, but for the analogous concept for functions in general (the derivative), we need to take $\Delta y$ and $\Delta x$ so small that the segment of the function they delimit is a line segment. Of course, for a lot of functions, that segment won't be a line segment however small $\Delta x$ and $\Delta y$ are made, so they must be made infinitely small, and the derivative is expressed as $\frac{dy}{dx}$.

This was the original meaning of differentials conceived by Leibniz, and it is still used all the time in applied math. The derivative rules are very easily, though non-rigorously, derived with this "definition" (see Thompson's "Calculus Made Easy").

The definition $dy=y'(x)dx$ is a modern definition designed to retroactively justify Leibniz' notation. $dx$ is just any finite real number, for this system, so now we're allowed to say $y'(x)=\frac{dy}{dx}$. But as you can see, $dy$ and $dx$ really aren't infinitesimals anymore, they are just $\Delta y$ and $\Delta x$ for the tangent line approximation. It still works formally, but now $y'(x)=\frac{dy}{dx}$ seems very circular.
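To make the tangent-line reading concrete, here is a short Python sketch. The function $y=\sin x$, the point $x=1$, and the helper name `gap` are all assumptions chosen for illustration. It compares the true change $\Delta y$ of the function with the differential $dy=y'(x)\,dx$ for a few finite values of $dx$.

```python
import math

def f(x: float) -> float:
    return math.sin(x)

def f_prime(x: float) -> float:
    return math.cos(x)

def gap(x: float, dx: float) -> float:
    """Return delta_y - dy: true change of f minus tangent-line change."""
    delta_y = f(x + dx) - f(x)   # actual change in the function
    dy = f_prime(x) * dx         # change along the tangent line
    return delta_y - dy

x = 1.0
for dx in (0.1, 0.01, 0.001):
    print(dx, gap(x, dx))
```

The printed gap shrinks much faster than `dx` itself, which is the sense in which $dy$ approximates $\Delta y$ even though both are ordinary finite numbers here.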

There are formal ways of expressing differentials that retain their original meaning as well, but they are more complicated. These ways are described in some of the other answers.