Suppose we have two functions $f,g:\Bbb R\rightarrow \Bbb R$. The chain rule states the following about the derivative of their composition: $$ (f \circ g)'(x) = f'(g(x))\cdot g'(x). $$ However, the equivalent expression in Leibniz notation seems to say something different. I know that $f'(g(x))$ means the derivative of $f$ evaluated at $g(x)$, but when considering the Leibniz form of the chain rule, it appears that it should really mean the derivative of $f$ with respect to $g(x)$. If we let $z=f(y)$ and $y=g(x)$, then $$ {\frac {dz}{dx}}={\frac {dz}{dy}}\cdot {\frac {dy}{dx}}, $$ where $\frac{dz}{dy}$ corresponds to $f'(g(x))$. Since $y=g(x)$, I am tempted to believe that the expression $f'(u)$ means the derivative of $f$ with respect to $u$; this would make sense here, as we are treating $g(x)$ as the independent variable. This leaves me with the question: does $f'(g(x))$ mean the derivative of $f$ evaluated at $g(x)$, $\frac{df}{dx} \Bigr\rvert_{x = g(x)}$, or the derivative of $f$ with respect to $g(x)$, $\frac{df}{dg(x)}$?
Confusion about chain rule with Leibniz's notation
1.5k views. Asked by Bumbble Comm (https://math.techqa.club/user/bumbble-comm/detail). There are 3 answers below.
On
$f'(g(x))$ means the derivative of $f$ evaluated at $g(x)$. Really the ambiguous one is Leibniz notation, because it makes you think the function $f$ "cares" about the name of its argument. $f$ is a function of one variable, so it can only be differentiated with respect to one thing: its only argument.
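For a concrete illustration (the particular functions are chosen only as an example), take $f(u) = u^2$ and $g(x) = \sin x$. Then $f'(u) = 2u$ whatever the argument is called, so $$ f'(g(x)) = 2\sin x, \qquad (f\circ g)'(x) = f'(g(x))\cdot g'(x) = 2\sin x\cos x. $$ Writing $f(t) = t^2$ or $f(x) = x^2$ instead defines the same function and therefore the same $f'$.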
On
While the other answers deal with the modern definition of derivatives, it is not actually impossible to make the original Leibniz notation completely rigorous as I sketched here (see "Notes"). In fact, doing so yields a generalization of the usual notion of derivatives (at least for one parameter), as shown by the examples in the linked post.
Furthermore, we can completely explain the error in your reasoning in this framework. $ \def\lfrac#1#2{{\large\frac{#1}{#2}}} $
Take any variables $x,y,z$ varying with parameter $t$ (which may well be $x$ or may be something else we do not care about). Then whenever $\lfrac{dz}{dy},\lfrac{dy}{dx}$ are defined, we have $\lfrac{dz}{dx} = \lfrac{dz}{dy} \cdot \lfrac{dy}{dx}$. If furthermore there are functions $f,g$ such that $z = f(y)$ and $y = g(x)$ everywhere (i.e. for every $t$), then by plain substitution $\lfrac{d(f(g(x)))}{dx} = \lfrac{d(f(y))}{dy} \cdot \lfrac{d(g(x))}{dx}$, which is equivalent to $(f\circ g)'(x) = f'(y) \cdot g'(x)$. Since $f'(y) = f'(g(x))$ everywhere, there is nothing wrong here at all!
So what is the error? $f'(u)$ is not "the derivative of $f$ with respect to $u$". That phrase actually does not make sense, because $f$ is a function in the modern sense and does not have any 'independent variable'! Instead, $f'(u) = \lfrac{d(f(u))}{du}$ for every variable $u$ whose value is always in the domain of $f$.
So $f'(g(x))$ is the derivative of $f$ at $g(x)$ but is not what you thought. Your "$\lfrac{df}{dx}|_{x=g(x)}$" does not make sense for two reasons: (1) Leibniz notation cannot be (correctly) mixed with (modern) functions, so "$\lfrac{df}{dx}$" is incorrect; (2) "$x=g(x)$" is meaningless. Instead, $f'(g(x)) = \lfrac{d(f(g(x)))}{d(g(x))}$, exactly in line with the above explanation of the Leibniz chain rule.
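As a small worked instance (the functions are picked purely for illustration), let $f(u) = e^u$ and $g(x) = x^2$, and set $y = g(x) = x^2$ and $z = f(y) = e^y$. Then $$ \frac{dz}{dy} = e^{y} = e^{x^2} = f'(g(x)), \qquad \frac{dy}{dx} = 2x = g'(x), \qquad \frac{dz}{dx} = \frac{dz}{dy}\cdot\frac{dy}{dx} = 2x\,e^{x^2}. $$ Here $\frac{dz}{dy}$ is a quotient involving the variables $z$ and $y$, and it takes the value $f'(g(x))$ simply because $y = g(x)$; at no point is $f$ differentiated "with respect to" a named variable.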
By the way, the reason for allowing variables $x,y,z$ that may differ from the underlying parameter $t$ is that in many applications we are interested in variables that in reality vary with time $t$ but are related in a way that does not depend on time, as is the case here.
In my opinion the usual way of writing the chain rule in Leibniz notation is confusing and, I would say, bad. It's a frequent source of confusion on this website.
In $\frac{dz}{dx} = \frac{dz}{dy} \frac{dy}{dx}$, the function that is called $z$ on the left is not the same as the function that is called $z$ on the right. In other words, two different functions are being called by the same name. It would be better to give the function on the left its own name, such as $\hat z(x) = z(y(x))$. Then, using Leibniz notation, the chain rule could be written as $\frac{d\hat z}{dx} = \frac{dz}{dy} \frac{dy}{dx}$. This is still a little confusing: $\frac{dz}{dy}$ is to be interpreted as $z'(y(x))$.
In my opinion the notation $$\hat z'(x) = z'(y(x)) y'(x)$$ is far clearer.
To specifically address the final part of your question: $f'(g(x))$ is the derivative of $f$ evaluated at $g(x)$. I would not use the phrase "derivative of $f$ with respect to $g(x)$".
Edit: Here is the thought process behind the Leibniz notation, and an explanation for why it has become so popular despite the fact that I think it's confusing.
Think about the quantity $z(y(x))$, and imagine what happens if $x$ is perturbed by a small amount $\Delta x$. Then the output of $y$ is perturbed by a small amount $\Delta y$, and the output of $z$ is correspondingly perturbed by a small amount $\Delta z$. Provided $\Delta y \neq 0$, we have $$ \frac{\Delta z}{\Delta x} = \frac{\Delta z}{\Delta y} \frac{\Delta y}{\Delta x}. $$ The term on the left is approximately $\hat z'(x)$, but you can see the temptation to call it $\frac{dz}{dx}$. The term $\frac{\Delta z}{\Delta y}$ is approximately $z'(y(x))$, but you can see the temptation to call it $\frac{dz}{dy}$. And the term $\frac{\Delta y}{\Delta x}$ is approximately $y'(x)$, and of course you see the temptation to call it $\frac{dy}{dx}$.
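The perturbation argument can be checked numerically. The following is only a rough sketch: the choices $y(x) = \sin x$ and $z(u) = u^2$, the step size, and the helper name `chain_rule_check` are assumptions made for illustration, and the check requires $\Delta y \neq 0$ at the chosen point.

```python
import math


def y(x):
    # Inner function (an arbitrary choice for this illustration).
    return math.sin(x)


def z(u):
    # Outer function (an arbitrary choice for this illustration).
    return u ** 2


def z_hat(x):
    # The composite, given its own name as suggested above.
    return z(y(x))


def chain_rule_check(x, dx=1e-6):
    """Return (dz/dx, (dz/dy)*(dy/dx), exact derivative) at the point x."""
    dy = y(x + dx) - y(x)            # perturbation of y caused by dx
    dz = z(y(x) + dy) - z(y(x))      # perturbation of z caused by dy
    lhs = dz / dx                    # approximates z_hat'(x)
    rhs = (dz / dy) * (dy / dx)      # approximates z'(y(x)) * y'(x)
    exact = 2 * math.sin(x) * math.cos(x)  # z'(y(x)) * y'(x) in closed form
    return lhs, rhs, exact


if __name__ == "__main__":
    print(chain_rule_check(1.0))
```

The first two returned values agree up to floating-point rounding, since the identity between difference quotients is exact, while both differ from the third only by the error of the finite-difference approximation.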