I've always wondered why the numerator is $d^n$ while the denominator is $dx^n$ instead of $d^nx$ like the numerator.
I must be missing something very obvious or fundamental. Is this notation derived from the chain rule in some way?
The following is not an explanation but a tip for remembering which one is correct. Think about physics. Say y is measured in meters and x in seconds. The first derivative is in meters per second. The second derivative is in meters per second per second. And so on. Hence the denominator must be an x-type quantity (dx, in seconds) raised to the power n.
Now for a pseudo-explanation: discretize your function with step $\epsilon=dx$, that is, let $u_k=y(x+k\epsilon)$. Let the difference operator be $\Delta: u\mapsto v$ with $v_k = u_{k+1}-u_k$. Let $\Delta^n$ mean the operator $\Delta$ applied $n$ times. For instance, if $w=\Delta^2 u$ then $w_k=(u_{k+2}-u_{k+1})-(u_{k+1}-u_k)=u_{k+2}-2u_{k+1}+u_k$. Then the $n$-th derivative of $y$ at the point $x$ is the limit of $(\Delta^n u)_{0} / \epsilon^n$ as $\epsilon \to 0$, where the index $0$ means evaluated at $k=0$. The numerator is the difference operator applied $n$ times (hence $d^n y$), while the denominator is the step raised to the power $n$ (hence $dx^n$).
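To make the pseudo-explanation concrete, here is a minimal Python sketch of it. The function names `forward_difference` and `nth_derivative_estimate` are made up for illustration, and the step sizes are arbitrary choices; this is only a numerical illustration of the formula $(\Delta^n u)_0/\epsilon^n$, not a recommended way to differentiate numerically.

```python
import math

def forward_difference(u):
    """Apply the difference operator Delta: v_k = u_{k+1} - u_k."""
    return [u[k + 1] - u[k] for k in range(len(u) - 1)]

def nth_derivative_estimate(y, x, n, eps):
    """Estimate the n-th derivative of y at x as (Delta^n u)_0 / eps^n,
    where u_k = y(x + k*eps)."""
    # Delta^n u at index 0 only needs the samples u_0, ..., u_n.
    u = [y(x + k * eps) for k in range(n + 1)]
    for _ in range(n):
        u = forward_difference(u)
    # The numerator is an n-fold difference; the denominator is the step
    # raised to the power n, mirroring d^n y / dx^n.
    return u[0] / eps ** n

# Second derivative of sin at x = 1; the exact value is -sin(1).
estimate = nth_derivative_estimate(math.sin, 1.0, 2, 1e-3)
print(estimate, -math.sin(1.0))
```

With a small step the two printed numbers should be close, and shrinking `eps` (within the limits of floating-point rounding) should bring the estimate closer to the exact value.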
The $dx^n$ versus $d^nx$ confusion can be resolved by thinking of $dx$ as the name of a single variable. You're confusing it with something like: $$ 3x^5 $$ where the order of operations is: "raise $x$ to the power of $5$, then multiply it by $3$". Note that $dx \neq (d)(x)$ and $dx \neq d \cdot x$. So in fact there is only one operation in the expression: $$ dx^n $$ Namely: "raise the single variable $dx$ to the power of $n$".
First, in one sense it's arbitrary because it's notation. But the notation makes sense, because the derivative is an operator, and we write it as $\frac d {dx} (f(x))$.
Now, if we take the second derivative, we apply the operator a second time, so we have $\frac d {dx}(\frac d {dx} (f(x)))$. Since this is the same operator applied twice, we can write it as $(\frac d {dx})^2(f(x))$.
Now, if you pretend that the derivative operator is a fraction and treat it as such (as we usually do by abuse of notation, since just about everything we do that way is backed up by theory), you can distribute the square over the numerator and the denominator: $(\frac d {dx})^2=\frac {(d^2)} {(dx)^2}$. The parentheses on the top are unnecessary, and because mathematicians are lazy, we abuse notation further and drop the parentheses on the bottom as well, leading us to the second derivative being $\frac {d^2} {dx^2}$. Repeat iteratively for the $n$th derivative.
That's the long version. The short version is, it's convention, it's short to write, and it works.
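For reference, the chain of rewrites in the long version can be laid out in one line, with $f(x)=x^3$ as an illustrative example that is not part of the original argument: $$ \frac{d}{dx}\left(\frac{d}{dx} f(x)\right) = \left(\frac{d}{dx}\right)^2 f(x) = \frac{d^2}{(dx)^2} f(x) = \frac{d^2 f}{dx^2}, \qquad \frac{d^2}{dx^2}\, x^3 = \frac{d}{dx}\left(3x^2\right) = 6x. $$ The middle equalities are purely notational: they treat $\frac{d}{dx}$ as if it were a fraction, which is the abuse of notation described above.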