What is the derivative of the derivative?


If we consider the derivative as a function from a function space to function space, then does it make sense to talk about the derivative of the derivative? In particular, if we consider

$$D:C^\infty[\mathbb{R}]\to C^\infty[\mathbb{R}]$$

and arbitrary smooth function $f\in C^\infty[\mathbb{R}]$, then can we reasonably ask if there is some other function on smooth real functions given by the following?

$$\frac{d}{df}[Df]$$

I've tried to do this myself with limits, where we take an arbitrary smooth function $u$ that approaches the constant function $0$. In particular, I found that

$$\lim_{u\to 0}\frac{D(f+u)-D(f)}{u}=\lim_{u\to 0}\frac{f'+u'-f'}{u}=\lim_{u\to 0}\frac{u'}{u}.$$

But I'm not sure whether that approaches any particular value independent of $u$'s path toward $0$. However, plugging in $u = 0$ does result in an indeterminate form, which leads me to suspect that there is a way of solving this. Instinctively, I think that this value should be $1$, both because $D$ is considered linear, and because applying L'Hopital's rule infinitely many times would result in

$$\lim_{u\to 0}\frac{u'}{u} = \lim_{u\to 0}\lim_{n\to\infty}\frac{u^{(n+1)}}{u^{(n)}}=\lim_{u\to 0}\lim_{n\to\infty}\frac{u^{(n)}}{u^{(n)}}=\lim_{u\to 0}\lim_{n\to\infty}1=1$$

But this does not seem particularly rigorous. Does my question even make sense, and if it does, is there a rigorous way of finding the solution?
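
Interpreting the quotient $u'/u$ pointwise, here is a quick numerical sketch (my own toy families for $u$, nothing canonical) of why the path matters: the families $u = \varepsilon e^x$ and $u = \varepsilon e^{2x}$ both approach the zero function as $\varepsilon \to 0$, but give different constant quotients.

```python
# Toy sketch (not a definition): interpret u'/u pointwise and send two
# different families of smooth functions to the zero function.
#   u1 = eps * exp(x)   has u1'/u1 = 1 at every x, for every eps
#   u2 = eps * exp(2x)  has u2'/u2 = 2 at every x, for every eps
import math

def quotient(u, du, x):
    """The pointwise quotient u'(x) / u(x)."""
    return du(x) / u(x)

for eps in (1.0, 0.1, 0.001):
    q1 = quotient(lambda x: eps * math.exp(x),
                  lambda x: eps * math.exp(x), 0.7)
    q2 = quotient(lambda x: eps * math.exp(2 * x),
                  lambda x: 2 * eps * math.exp(2 * x), 0.7)
    # q1 stays at 1.0 and q2 stays at 2.0 no matter how small eps gets,
    # so the limit depends on which path u takes toward 0.
```

So the quotient is $1$ along one family and $2$ along another, which makes me doubt that the limit is well-defined as written.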


Edit: To add to the confusion, if we treat differentiation with respect to a function the way we do in $\mathbb{R}$, we find that $$ \frac{d}{df}[Df] = \frac{df'}{dx}\frac{dx}{df}=\frac{f''}{f'}, $$ which I'm pretty sure is not quite the same as differentiating with respect to the function itself, but I don't fully understand why it would be different.


BEST ANSWER

This was a question that also bothered me when I was younger - and I was much surprised to find that the only satisfactory answer is that the derivative of the operator $D$ is itself. It's worth understanding why this must be - and not to bog ourselves down with too much analysis (especially since the main hurdle to such analysis is getting the derivative operator to behave well enough to do analysis on at all).

The most important fact about derivatives is that they represent linear approximations near a point. When we speak, in finite dimensions, of a derivative of a function $f:\mathbb R^n\rightarrow\mathbb R^m$, we are really just saying "near a point $p$, $f(v)$ is close to $f(p)+L(v-p)$ for some linear function $L$" - where by "near" we usually mean that the error is $o(|v-p|)$ - meaning that for any $\varepsilon > 0$ there is some $\delta$ so that if $|v - p| < \delta$ then $|f(v) - (f(p) + L(v-p))| \leq \varepsilon|v-p|$.

If you're used to working with coordinates, that linear function $L$ is represented by the Jacobian matrix - and the existence of such a matrix is precisely how "differentiability" is usually defined for multiple variables. Sometimes this is stated by saying that $L$ is such that $$\lim_{v\rightarrow p}\frac{f(v)-(f(p)+L(v-p))}{|v-p|} = 0$$ which is another notation for the same thing - note that we can't divide directly by $v-p$ because it's just some vector; it doesn't make sense to try to divide by, for instance, "5 miles northeast" - even if that makes perfect sense as a position on a two dimensional grid. Instead, we have to suppose we already have a guess at what the derivative is, and are just seeking to verify it.
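
As a concrete finite-dimensional check of this definition (an illustration of mine, not part of the answer), take $f(x,y) = (x^2 + y,\ xy)$ at $p = (1,2)$, whose Jacobian is the matrix with rows $(2,1)$ and $(2,1)$; the approximation error shrinks faster than $|v-p|$:

```python
# Verify the o(|v - p|) error bound for f(x, y) = (x^2 + y, x*y) at p = (1, 2).
# L is the Jacobian of f at p, computed by hand.
import math

def f(x, y):
    return (x * x + y, x * y)

P = (1.0, 2.0)
L = [[2.0, 1.0], [2.0, 1.0]]  # Jacobian of f at P

def error_ratio(h):
    """|f(P+h) - f(P) - L h| / |h| for a displacement h = (h1, h2)."""
    v = (P[0] + h[0], P[1] + h[1])
    fv, fp = f(*v), f(*P)
    Lh = (L[0][0] * h[0] + L[0][1] * h[1],
          L[1][0] * h[0] + L[1][1] * h[1])
    err = math.hypot(fv[0] - fp[0] - Lh[0], fv[1] - fp[1] - Lh[1])
    return err / math.hypot(*h)

# Shrinking the displacement shrinks the ratio toward 0 - the error really
# is sublinear, so L is the derivative of f at P.
ratios = [error_ratio((t, t)) for t in (1e-1, 1e-2, 1e-3)]
```

For this particular $f$ the ratio works out to roughly $|t|$ itself, which is exactly the sublinear behaviour the definition demands.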

There's a notable special case here: if $f$ is already linear, this is trivial: the Jacobian, at every point, is just the matrix that represented $f$ to begin with. For instance, the map $$f:\mathbb R^2\rightarrow\mathbb R$$ $$f(x,y)=2x+y$$ has its Jacobian at every point being, well, the matrix taking $(x,y)$ to $2x+y$ - because, near any point, this function tells us exactly how much a change to the input changes the output - we're not even approximating anything here because the error term is not only sublinear, it's zero.*

The derivative as a map falls into this special case - where "derivatives are linear" just means that $(f+g)' = f'+g'$ and $(\alpha f)'=\alpha(f')$ where $\alpha$ is a scalar (i.e. a real/complex number, depending on the context). Saying "the derivative of $D$ is itself" just says that if we have some value $f$ for which we know $Df$ and want to shift the input $f$ by some vector $g$, the resulting change in the output (which should be approximately the derivative of $D$ applied to the vector $g$) is "about" $Dg$ - where, by "about" we mean "exactly." As a result, under every definition of the derivative I'm aware of that allows us to even pose the question, the answer will be that the derivative of $D$ is $D$.
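
To see that "about" really means "exactly" here, one can restrict attention to polynomials, where everything is checkable in exact arithmetic. The sketch below (my own; polynomials as coefficient lists) verifies that $D(f+g) - D(f) = D(g)$ with zero error:

```python
# A polynomial is a coefficient list [a0, a1, a2, ...], and D acts on it
# linearly. Because D is linear, the change in Df under a shift g is not
# merely approximated by Dg - it equals Dg.

def D(coeffs):
    """Derivative of a polynomial given by its coefficient list."""
    return [k * coeffs[k] for k in range(1, len(coeffs))]

def add(p, q):
    n = max(len(p), len(q))
    p = p + [0.0] * (n - len(p))
    q = q + [0.0] * (n - len(q))
    return [a + b for a, b in zip(p, q)]

def sub(p, q):
    return add(p, [-b for b in q])

f = [1.0, -2.0, 0.0, 5.0]   # 1 - 2x + 5x^3
g = [0.0, 3.0, 4.0]         # 3x + 4x^2

# D(f + g) - D(f) agrees with D(g) exactly (up to trailing zero coefficients).
diff = sub(D(add(f, g)), D(f))
```

No limit is needed at all: the "error term" of the linear approximation is identically zero, which is the content of "$D$ is its own derivative."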

This can be a bit confusing because you seem tempted to want the derivative to be $$\lim_{u\rightarrow 0}\frac{f'+u'-f'}{u}$$ which kinda makes sense, since $u$ is a function so it kinda makes sense to divide by it, but the notion of a "derivative" treats $u$ as if it were just some vector, not as something we can do algebra on - it's just a confounding coincidence that we could write this expression, not a useful definition of a derivative. Note especially that $u$ could be zero at various points, which basically ensures that this limit does not make sense (even before we run into other problems with the definition - like taking a limit over a whole function without specifying how). Luckily for us, this isn't how derivatives are defined in multiple dimensions - and in terms of linear approximations, it's unavoidable that $D$ is its own derivative.

(*To catch a very reasonable confusion: when we're saying "the derivative of something is itself" it feels like we're discussing exponential functions - but that's a slightly different idea. The one dimensional analog of this is to imagine that $f(x)=2x$ has this property that near say $x$ we can approximate $f(x+\delta)\approx f(x)+2\delta$ - which is what a derivative means. What we mean to note is that this linear approximation $2\delta$ is just $f(\delta)$. This is distinct from saying that $g(x)=e^x$ has that $g(x)=g'(x)$, which instead expands to say that $g(x+\delta)$ is about $g(x)+g'(x)\delta$ - where, of course, the term $g'(x)\delta$ giving the (approximate) change over a small $\delta$ is not $g(\delta)$. This is not the sense we mean when we say $D$ is its own derivative - and is a sense that is limited to one dimension anyways)

SECOND ANSWER

To solve this, let's first ask what kind of object the derivative of the derivative should be. To see this, consider what kind of object the derivative itself is. The derivative is a map $D: C^\infty[\mathbb R] \to C^\infty[\mathbb R]$ that takes a function of one variable and returns another function of one variable.

But what are functions in this context?

The functions themselves form a vector space. A vector $\mathbf x$ in $\mathbb R^n$ has components $x_a$ which are indexed by elements of a finite set, i.e. $a \in \{1,...,n\}$. Similarly, a function $f$ can be thought of as having components $f(a)$ which are indexed by elements of $\mathbb R$, i.e. $a\in\mathbb R$.

So, the derivative of the derivative should be analogous to the derivative of a function $f: \mathbb R^n \to \mathbb R^n$, which is its Jacobian matrix, which in general also depends on $\mathbf x$, i.e. the derivative is $f':\mathbb R^n \to \mathbb R^{n\times n}$. If the functional analogue of a vector in $\mathbb R^n$ is a function $f(x)$, then the functional analogue of a matrix in $\mathbb R^{n\times n}$ is a function of two variables $f(x,y)$ - two variables act like two indices. Therefore, the derivative of the derivative should be a map that takes a function of one variable and returns a function of two variables (it actually returns a generalized function/distribution of two variables).
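
One way to make this "matrix indexed by pairs of real numbers" picture concrete (a discretized sketch of mine, not from the answer): sample functions at $n$ grid points, so a function becomes a vector in $\mathbb R^n$ and $D$ becomes an $n\times n$ finite-difference matrix, whose entries play the role of a kernel in two variables.

```python
# Discretize D as an n x n matrix of central finite differences (one-sided
# at the two boundary rows). The matrix entry Dmat[i][j] is the discrete
# stand-in for a kernel in two variables evaluated at (x_i, x_j).
import math

n, h = 200, 0.01
grid = [i * h for i in range(n)]

Dmat = [[0.0] * n for _ in range(n)]
for i in range(1, n - 1):
    Dmat[i][i - 1] = -1.0 / (2 * h)
    Dmat[i][i + 1] = 1.0 / (2 * h)
Dmat[0][0], Dmat[0][1] = -1.0 / h, 1.0 / h
Dmat[-1][-2], Dmat[-1][-1] = -1.0 / h, 1.0 / h

f = [math.sin(x) for x in grid]   # "function" = vector of samples
Df = [sum(Dmat[i][j] * f[j] for j in range(n)) for i in range(n)]

# On interior grid points, the matrix-vector product approximates cos.
max_err = max(abs(Df[i] - math.cos(grid[i])) for i in range(1, n - 1))
```

As the grid is refined, this matrix concentrates near its diagonal - a hint of the $\delta'(x-y)$ kernel that appears below.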


Now that it is established what kind of object the derivative of the derivative would be, let's find out some more properties. The derivative $D$ is a linear function, i.e. $$ D(\alpha f + \beta g) = \alpha D(f) + \beta D(g). $$ If a function $L: \mathbb R^n \to \mathbb R^n$ is linear, then its Jacobian matrix $L'(\mathbf x)$ is constant and the following holds: $$ L'(\mathbf x)\mathbf y = L(\mathbf y), $$ where the multiplication on the LHS is matrix multiplication. One might say that this means that $L'$ is the same as $L$, but the former is a constant function which returns a matrix, while the latter is a function that returns the result of multiplication by said matrix. It is similar, but not quite the same.
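
A small numerical check of the statement above (my own sketch, with an arbitrarily chosen matrix $A$): the finite-difference Jacobian of a linear map comes out as the same matrix $A$ at every point.

```python
# For a linear map L(v) = A v on R^2, the Jacobian computed by finite
# differences is the constant matrix A, regardless of the base point.

A = [[2.0, 1.0], [0.0, 3.0]]  # an arbitrary example matrix

def L(v):
    return [A[0][0] * v[0] + A[0][1] * v[1],
            A[1][0] * v[0] + A[1][1] * v[1]]

def jacobian(fun, p, h=1e-6):
    """Forward-difference Jacobian of fun: R^2 -> R^2 at the point p."""
    fp = fun(p)
    J = [[0.0, 0.0], [0.0, 0.0]]
    for j in range(2):
        q = list(p)
        q[j] += h
        fq = fun(q)
        for i in range(2):
            J[i][j] = (fq[i] - fp[i]) / h
    return J

# Same matrix at two very different base points: L' is constant.
J_at_origin = jacobian(L, [0.0, 0.0])
J_elsewhere = jacobian(L, [5.0, -7.0])
```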

Similarly, for the derivative of the derivative, the following holds: $$ D'(f)g = D(g) = g', $$ where the functional equivalent of matrix multiplication is used.

Therefore, the derivative of the derivative $D'(f)$ is a function of two variables which is the same for any choice of $f$. The only thing remaining to be answered is which function of two variables it is. It turns out to be the derivative of the Dirac delta function, i.e. $$ [D'(f)](x,y) = \delta'(x-y) \qquad \forall f \in C^\infty[\mathbb R]. $$

This can be verified with the functional equivalent of matrix multiplication: $$ [D'(f)g](x) = \int_{-\infty}^\infty [D'(f)](x,y)g(y)\,dy =\int_{-\infty}^\infty \delta'(x-y)g(y)\,dy =(\delta' * g)(x) = g'(x). $$
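
This identity can also be checked numerically (a sketch of mine, with $\delta$ replaced by a narrow Gaussian $\delta_s$ and the integral by a Riemann sum); the result lands close to $g'(x)$:

```python
# Approximate delta by a Gaussian of width s, so delta_s'(x - y) is an
# explicit kernel, and evaluate the convolution integral by a Riemann sum.
import math

def dgauss(t, s):
    """Derivative of the Gaussian approximation to the delta function."""
    return -t / (s**3 * math.sqrt(2 * math.pi)) * math.exp(-t * t / (2 * s * s))

def apply_kernel(g, x, s=0.05, h=0.001, width=1.0):
    """Riemann sum for the integral of delta_s'(x - y) g(y) dy near y = x."""
    total = 0.0
    y = x - width
    while y <= x + width:
        total += dgauss(x - y, s) * g(y) * h
        y += h
    return total

g = math.sin
approx = apply_kernel(g, 0.3)   # close to g'(0.3) = cos(0.3)
```

Shrinking the width $s$ (and the step $h$ with it) drives the answer toward $g'(x)$ exactly, which is the distributional statement above.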