Proving $\frac{dy'}{dy} = \frac{y''}{y'}$ in general without abuse of notation


My understanding of derivatives is that:

$$f'(x) = \lim_{h \to 0} \frac{f(x + h) - f(x)}{h}$$

where the limit is defined by the usual $\epsilon$-$\delta$ statement in first-order logic.

And so $\frac{df(x)}{dx} = f'(x)$, as per usual.
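
For concreteness, here is a quick numerical sanity check of this definition (a minimal Python sketch; the choice $f(x) = x^3$ and the step sizes are mine):

```python
# Difference quotient (f(x+h) - f(x)) / h for f(x) = x**3 at x = 2.
# The exact derivative is 3 * x**2 = 12.
f = lambda x: x**3
x = 2.0
for h in [1e-1, 1e-3, 1e-5]:
    print(h, (f(x + h) - f(x)) / h)  # approaches 12 as h -> 0
```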

This doesn't work so well when people start talking about $\frac{df'(x)}{df(x)}$.

In the special case where $y = f(x)$ is invertible, we can rephrase this using the chain rule:

Let $g(y) = f^{-1}(y) = x$, then $\frac{df'(x)}{df(x)}$ is:

$$\begin{align*}
\frac{d}{dy}\left( f'(g(y)) \right) &= f''(g(y)) \cdot \frac{d}{dy}\left( g(y) \right) \\
&= f''(x) \cdot g'(y) \\
&= f''(x) \cdot g'(f(x)) \\
&= f''(x) \cdot \frac{d}{dx}\left( g(f(x)) \right) \cdot \frac{1}{f'(x)} && \text{by the chain rule, } g'(h(x)) = \frac{d}{dx}\left( g(h(x)) \right) \cdot \frac{1}{h'(x)} \\
&= f''(x) \cdot \frac{d}{dx}\left( x \right) \cdot \frac{1}{f'(x)} && \text{since } g(f(x)) = f^{-1}(f(x)) = x \\
&= \frac{f''(x)}{f'(x)}
\end{align*}$$
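
As a sanity check on this derivation, here is a short SymPy computation (my own choices: $f = \tan$ restricted to $(-\pi/2, \pi/2)$, so $g = \arctan$):

```python
import sympy as sp

x, y = sp.symbols('x y')
f = sp.tan(x)           # invertible on (-pi/2, pi/2)
g = sp.atan(y)          # g = f^{-1}

# LHS: d/dy of f'(g(y)), then expressed in terms of x via y = f(x)
lhs = sp.diff(sp.diff(f, x).subs(x, g), y).subs(y, f)
rhs = sp.diff(f, x, 2) / sp.diff(f, x)

print(sp.simplify(lhs - rhs))   # 0
```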

It is asserted by many on MSE that: $\frac{df'(x)}{df(x)} = \frac{f''(x)}{f'(x)}$ in general.

However, I can't seem to make sense of this in terms of the usual $\epsilon$-$\delta$ definition.


This leads me to think that there are multiple notions of derivatives:

  • The ordinary derivative defined with $\epsilon$-$\delta$.
  • The notion of a differential, which builds on top of the ordinary derivative.
    • Here $df_x(t) = f'(x) \cdot t$, where $d$ operates on a function. This makes the identity above trivial (see the sketch after this list): $$\frac{df'_x(t)}{df_x(t)} = \frac{f''(x) \cdot t}{f'(x) \cdot t} = \frac{f''(x)}{f'(x)}$$

    • I suspect this is what many of the answers are using, and this is what people mean when they say "using Leibniz notation".
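
Here is a small SymPy sketch of that differential computation (the sample function $f = \sin$ is my own choice):

```python
import sympy as sp

x, t = sp.symbols('x t')
f = sp.sin(x)                   # any twice-differentiable sample function

df  = sp.diff(f, x) * t         # differential df_x(t)  = f'(x) * t
dfp = sp.diff(f, x, 2) * t      # differential df'_x(t) = f''(x) * t

# The ratio of differentials is f''(x)/f'(x), independently of t:
print(sp.simplify(dfp / df - sp.diff(f, x, 2) / sp.diff(f, x)))   # 0
```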


My question is the following:

  • Is it possible to prove the general case without using the notion of differentials?
  • Am I wrong in thinking that there are multiple notions of derivative, and that "differentiating with respect to a function" is not the same thing as the ordinary $\epsilon$-$\delta$-based derivative?

Edit: Here are some MSE answers which claim this is true in general:

  1. Why is $\frac{dy'}{dy}$ zero, since y' depends on y?
  2. Simplifying $\frac{dy'}{dy}$ where $y=f(x)$
  3. Derivative of a function with respect to another function.
  4. differentiate with respect to a function
  5. What is $\frac{d}{dx}\left(\frac{dx}{dt}\right)$?
  6. Circular Motion
  7. Showing $\ddot{x} = \frac{\mathrm{d}}{\mathrm{d}x}(\frac{1}{2} \dot{x}^2)$
  8. Is there a way to rigorously define "taking the derivative with respect to a function"
  9. Derivative with respect to another function
  10. Taking a derivative of a function with respect to another function

There are 3 answers below.

Best answer:

In the first place, what does $\frac{\mathrm{d} f' (x)}{\mathrm{d} f (x)}$ mean? Clearly, it suffices to define what $\frac{\mathrm{d} g (x)}{\mathrm{d} f (x)}$ means: once we know that, we can simply substitute $f'$ for $g$. We should also define it in such a way that when $f$ is the identity function – i.e. $f (x) = x$ – then $\frac{\mathrm{d} g (x)}{\mathrm{d} f (x)}$ has the same meaning as $\frac{\mathrm{d} g (x)}{\mathrm{d} x}$. There is an obvious choice: $$\frac{\mathrm{d} g (x)}{\mathrm{d} f (x)} = \lim_{h \to 0} \frac{g (x + h) - g (x)}{f (x + h) - f (x)}$$ Since $f (x + h) - f (x)$ could be $0$ for $h \ne 0$ we should be a little bit more careful, so let us say that the value of $\frac{\mathrm{d} g (x)}{\mathrm{d} f (x)}$ at $x = x_0$ is $M$ if, for all $\epsilon > 0$, there exists $\delta > 0$ such that for all $h$ such that $0 < \left| h \right| < \delta$, $$\left| g (x_0 + h) - g (x_0) - M \cdot (f (x_0 + h) - f (x_0)) \right| < \epsilon \cdot \left| f (x_0 + h) - f (x_0) \right|$$

(This definition generalises straightforwardly to the vector-valued multivariable case, provided we understand $M$ needs to be a matrix of the appropriate dimensions.) Since both the left and right hand side are non-negative, if $\frac{\mathrm{d} g (x)}{\mathrm{d} f (x)}$ has a value at $x = x_0$, then there exists $\delta > 0$ such that for all $h$ such that $0 < \left| h \right| < \delta$, $\left| f (x_0 + h) - f (x_0) \right| > 0$, i.e. $f$ is not constant on any neighbourhood of $x_0$.
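
To see the definition in action numerically, here is a minimal sketch (the choices $f(x) = x^2$, $g(x) = x^3$, $x_0 = 1$ are mine; the expected value is $M = g'(1)/f'(1) = 3/2$):

```python
# Quotient (g(x0+h) - g(x0)) / (f(x0+h) - f(x0)) near x0 = 1.
# Since f'(1) = 2 != 0, the denominator is nonzero for small h != 0.
f = lambda x: x**2
g = lambda x: x**3
x0 = 1.0
for h in [1e-1, 1e-3, 1e-5]:
    print(h, (g(x0 + h) - g(x0)) / (f(x0 + h) - f(x0)))
# The quotient tends to 1.5, i.e. g'(1)/f'(1).
```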

Now, with all that preamble out of the way, let me state:

Theorem. If $f (x)$ and $g (x)$ are differentiable at $x = x_0$, with $f' (x_0)$ and $g' (x_0)$ as the values of $\frac{\mathrm{d} f (x)}{\mathrm{d} x}$ and $\frac{\mathrm{d} g (x)}{\mathrm{d} x}$ at $x = x_0$ respectively, and $f' (x_0) \ne 0$, then $\frac{\mathrm{d} g (x)}{\mathrm{d} f (x)}$ has value $\frac{g' (x_0)}{f' (x_0)}$ at $x = x_0$.

Proof. Let $0 < \epsilon < 1$. By hypothesis, there exists $\delta_1 > 0$ such that for all $h$ such that $0 < \left| h \right| < \delta_1$, $$\left| f (x_0 + h) - f (x_0) - f' (x_0) \cdot h \right| < \frac{1}{3} \epsilon \cdot \left| h \right| \cdot \frac{\min \left\{ \left| f' (x_0) \right|, \left| f' (x_0) \right|^2 \right\}}{\max \left\{ 1, \left| g' (x_0) \right| \right\}}$$ (Replace $\epsilon$ with $\frac{1}{3} \epsilon \cdot \frac{\min \left\{ \left| f' (x_0) \right|, \left| f' (x_0) \right|^2 \right\}}{\max \left\{ 1, \left| g' (x_0) \right| \right\}}$ in the definition.) We then have: $$\left| f (x_0 + h) - f (x_0) - f' (x_0) \cdot h \right| < \frac{1}{3} \left| f' (x_0) \cdot h \right|$$ $$\left| \frac{g' (x_0)}{f' (x_0)} \right| \cdot \left| f (x_0 + h) - f (x_0) - f' (x_0) \cdot h \right| < \frac{1}{3} \epsilon \cdot \left| f' (x_0) \cdot h \right|$$

Similarly, by hypothesis, there exists $\delta_2 > 0$ such that for all $h$ such that $0 < \left| h \right| < \delta_2$, $$\left| g (x_0 + h) - g (x_0) - g' (x_0) \cdot h \right| < \frac{1}{3} \epsilon \cdot \left| f' (x_0) \cdot h \right|$$ (Replace $\epsilon$ with $\frac{1}{3} \epsilon \cdot \left| f' (x_0) \right|$ in the definition.)

Let $\delta = \min \{ \delta_1, \delta_2, 1 \}$. Then, for all $h$ such that $0 < \left| h \right| < \delta$: $$\begin{multline} \left| g (x_0 + h) - g (x_0) - \frac{g' (x_0)}{f' (x_0)} \cdot ( f (x_0 + h) - f (x_0) ) \right| \\ \le \left| g (x_0 + h) - g (x_0) - \frac{g' (x_0)}{f' (x_0)} \cdot f' (x_0) \cdot h \right| + \left| \frac{g' (x_0)}{f' (x_0)} \cdot ( f (x_0 + h) - f (x_0) - f' (x_0) \cdot h ) \right| \end{multline}$$ The first term is $< \frac{1}{3} \epsilon \cdot \left| f' (x_0) \cdot h \right|$. The second term is also $< \frac{1}{3} \epsilon \cdot \left| f' (x_0) \cdot h \right|$. Thus the LHS is $< \frac{2}{3} \epsilon \cdot \left| f' (x_0) \cdot h \right|$. But, $$\begin{multline} \left| f' (x_0) \cdot h \right| \le \left| f (x_0 + h) - f (x_0) \right| + \left| f (x_0 + h) - f (x_0) - f' (x_0) \cdot h \right| \\ < \left| f (x_0 + h) - f (x_0) \right| + \frac{1}{3} \left| f' (x_0) \cdot h \right| \end{multline}$$ so $\left| f' (x_0) \cdot h \right| < \frac{3}{2} \left| f (x_0 + h) - f (x_0) \right|$. Therefore, $$\left| g (x_0 + h) - g (x_0) - \frac{g' (x_0)}{f' (x_0)} \cdot ( f (x_0 + h) - f (x_0) ) \right| < \epsilon \cdot \left| f (x_0 + h) - f (x_0) \right|$$ as required. ◼
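
A quick numerical illustration of the theorem with $g = f'$ (the case in the question); the choices $f = \sin$ and $x_0 = 0.3$ are mine, so the predicted value is $f''(x_0)/f'(x_0) = -\tan(0.3) \approx -0.309$:

```python
import math

f, fp = math.sin, math.cos      # f and f'
x0 = 0.3
for h in [1e-1, 1e-3, 1e-5]:
    print(h, (fp(x0 + h) - fp(x0)) / (f(x0 + h) - f(x0)))
print(-math.tan(x0))            # the predicted limit f''(x0)/f'(x0)
```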


I also beg to differ with those who say that things like "independent variable" are meaningless conversational filler. Pure mathematicians – some of us anyway – know how to make this rigorous. The trick is to recognise the concept of context and make it a concrete thing. In probability theory, this is the purpose of the sample space. We can do the same for basic analysis... but this kind of formalisation is usually not helpful for early students, so we do not teach it.

Second answer:

If $u$ and $v$ are functions of $x$, we can even bring in a new independent variable $t$:

$$\frac{du}{dv} = \frac{du/dt}{dv/dt}$$

or, since Leibniz notation remains valid with the old independent variable $x$ itself, we can let

$$ u=f'(x), v= f(x) $$

$$\frac{df'(x)}{df(x)} = \frac{\dfrac{df'(x)}{dt}}{\dfrac{df(x)}{dt}} = \frac{\dfrac{df'(x)}{dx}}{\dfrac{df(x)}{dx}} = \frac{f''(x)}{f'(x)} $$

You can even introduce a third independent variable if you wish to.

Incidentally, as with L'Hôpital's rule or the quotient rule, if the LHS is a constant then the RHS is that same constant.
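
Here is a SymPy sketch of this parametric route (the parametrisation $x = e^t$ and the sample $f(z) = z^3 + z$ are my own choices; note $f'(z) = 3z^2 + 1 \ne 0$):

```python
import sympy as sp

t, z = sp.symbols('t z')
x = sp.exp(t)                    # a strictly monotone x(t)
f = z**3 + z                     # sample f with nonvanishing f'

fp, fpp = sp.diff(f, z), sp.diff(f, z, 2)
u = fp.subs(z, x)                # u = f'(x(t))
v = f.subs(z, x)                 # v = f(x(t))

# du/dv computed as (du/dt)/(dv/dt), compared with f''(x)/f'(x):
ratio = sp.diff(u, t) / sp.diff(v, t)
print(sp.simplify(ratio - (fpp / fp).subs(z, x)))   # 0
```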

Third answer:

Let us operate with the usual $\epsilon$-$\delta$ definition of the derivative. It is standard within this framework to prove (without any handwaving) the usual rules of differentiation: linearity, the Product Rule, and the Chain Rule.

Suppose then we have open intervals $X,Y$ and twice-differentiable functions $f:X\to Y$, $g:Y\to X$ satisfying $f\circ g=I_Y$ and $g\circ f=I_X$ where $I_X, I_Y$ are the identity maps on $X$, $Y$. Suppose that $f'\not=0$ on $X$ and $g'\not=0$ on $Y$.

We then have by applying the Chain Rule to $f\circ g=I_Y$ that $$ (f'\circ g) g'=1, \tag{*} $$ where $1$ denotes the constant function whose value is always $1$.

From $(*)$ we get using the Product Rule that $$ (f'\circ g)'g'+(f'\circ g)g''=0 $$ which we can rewrite using $(*)$ as $$ (f'\circ g)'=-\frac{g''}{(g')^2}.\tag{1} $$

But we also have from $(*)$ using the Chain Rule and the Product Rule that $$ (f''\circ g) (g')^2+(f'\circ g)g''=0 $$ which we can rewrite as $$ \frac{g''}{(g')^2}=-\frac{(f''\circ g)}{(f'\circ g)}.\tag{2} $$

From (1) and (2) we then have $$ (f'\circ g)'=\frac{(f''\circ g)}{(f'\circ g)}. \tag{3} $$

Let us now re-write this in old-fashioned language, evaluating each side at the point $y\in Y$, and writing $x=g(y)$ (so that $y=f(x)$).

The right hand side is clearly just $\frac{f''(x)}{f'(x)}$.

The left hand side can be re-written, if we allow ourselves to abuse notation, as follows. The outermost derivative is with respect to $y=f(x)$: so write it as $\frac{d}{df}$. The innermost derivative is with respect to $x$, so let's continue to write it as $f'$. Putting this together we have $$ \frac{d f'(x)}{df}=\frac{f''(x)}{f'(x)}. $$
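
As a closing check, identity (3) can be verified symbolically for a concrete pair; the sample $f = \log$ on $(0, \infty)$ (so $g = \exp$) is my own choice:

```python
import sympy as sp

x, y = sp.symbols('x y', positive=True)
f = sp.log(x)                  # f : (0, oo) -> R
g = sp.exp(y)                  # g = f^{-1}

lhs = sp.diff(sp.diff(f, x).subs(x, g), y)               # (f' o g)'
rhs = (sp.diff(f, x, 2) / sp.diff(f, x)).subs(x, g)      # (f'' o g)/(f' o g)

print(sp.simplify(lhs - rhs))  # 0, confirming identity (3)
```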