Chain Rule Intuition


We know that the chain rule is used to differentiate a composite function, say $$f(x) = h(g(x)).$$ It gives the derivative as the derivative of the outer function times the derivative of the inner function (in either order).

$$\frac{\mathrm{d}y}{\mathrm{d}x} = \frac{\mathrm{d}y}{\mathrm{d}u} \cdot \frac{\mathrm{d}u}{\mathrm{d}x}$$

Although we know that the above expression is not a product of fractions (even though it uses the fractional notation for the derivative introduced by Leibniz), you can "cancel" the two du's and get back dy/dx.

My question is: how can you even think of cancelling the du in dy/du against the du in du/dx when these are not even fractions? Do they automatically become fractions just because they are being multiplied? Are they really being "multiplied"?

I'm really looking for the intuition behind this. To me this is some kind of fantasy. It doesn't appear to be real.

6

There are 6 best solutions below

3
On BEST ANSWER

In a race, Usain Bolt is travelling twice as fast as a train which is going 3 times as fast as a horse. How much faster is Usain Bolt travelling than the horse?

$$ \frac{d\text{Bolt}}{d\text{Horse}}= \frac{d\text{Bolt}}{d\text{Train}} \cdot \frac{d\text{Train}}{d\text{Horse}} = 2\cdot 3 = 6 $$

1
On

If $h$ and $g$ are linear functions, then it should be obvious that the chain rule must hold. The general chain rule is simply this observation plus the fact that derivatives provide good linear approximations.
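
To spell out the linear case, take $g(x) = cx + d$ and $h(u) = au + b$ (the constants $a, b, c, d$ are names introduced here only for illustration). Then
$$h(g(x)) = a(cx + d) + b = (ac)\,x + (ad + b),$$
so the slope of the composite is $ac$, which is exactly the slope of $h$ times the slope of $g$.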

0
On

Think of the plots of $f(x)$ and $f(3x)$. It is clear that the slope of the second at $x$ is 3 times the slope of the first at $3x$. Now let the factor change from point to point...

2
On

Here's the intuition I give every time I teach the Chain Rule:

Remember that derivatives are rates, and the Chain Rule explains how to meaningfully multiply these rates together. A cheetah is 4 times as fast as a man, and a man is 10 times as fast as a snail. You can see right away how to compare the cheetah to the snail: the cheetah is 40 (that is, $4 \times 10$) times as fast.

The Chain Rule is just the formula for computing more difficult derivatives by using an intermediate step. We have $y=f(x)$, and we can get the rate of change of $y$ with respect to $x$ by going through an intermediate variable $u=g(x)$ (where $f(x)=h(g(x))$). We get $$f'(x)= h'(g(x)) \, g'(x)$$ or, equivalently, $$ \frac{dy}{dx} = \frac{dy}{du} \frac{du}{dx}.$$
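
The "intermediate step" can also be checked numerically. The following is a minimal sketch, not a proof: the functions `g(x) = x**2` and `h(u) = sin(u)`, the point `x0`, and the helper `derivative` are all assumptions chosen here for illustration. It estimates the three rates with central differences and compares $dy/dx$ with the product $dy/du \cdot du/dx$.

```python
import math


def derivative(fn, x, step=1e-6):
    """Central-difference estimate of fn'(x) (an approximation, not exact)."""
    return (fn(x + step) - fn(x - step)) / (2 * step)


# Hypothetical example functions, chosen only for illustration.
def g(x):
    """Inner function: u = g(x)."""
    return x ** 2


def h(u):
    """Outer function: y = h(u)."""
    return math.sin(u)


def f(x):
    """Composite function: y = h(g(x))."""
    return h(g(x))


x0 = 1.3
dy_dx = derivative(f, x0)       # rate of y with respect to x
dy_du = derivative(h, g(x0))    # rate of y with respect to u, taken at u = g(x0)
du_dx = derivative(g, x0)       # rate of u with respect to x

# The two printed numbers should agree up to the finite-difference error.
print(dy_dx, dy_du * du_dx)
```

Note that `dy_du` is evaluated at `g(x0)`, not at `x0`; that is the $h'(g(x))$ in the formula.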


The above is just a quick intuitive explanation for why the Chain Rule involves multiplying derivatives and "canceling." Rahul's answer explains a proof for this fact.

Having proofs is essential, because sometimes derivatives may not work as you expect. For example, if you have $z=f(x,y)$ where $x$ and $y$ are both functions of $t$, the Chain Rule looks like $$\frac{dz}{dt} = \frac{\partial z}{\partial x} \frac{dx}{dt} + \frac{\partial z}{\partial y} \frac{dy}{dt},$$ which is not the same as ordinary fraction cancelling.
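
A small numerical sketch of the two-variable formula, under assumptions made up for illustration ($z = xy$, $x = \cos t$, $y = t^2$, and a hypothetical `derivative` helper based on central differences). Naive "cancelling" would suggest that each term on the right already equals $dz/dt$; the numbers show that only their sum does.

```python
import math


def derivative(fn, t, step=1e-6):
    """Central-difference estimate of fn'(t) (an approximation, not exact)."""
    return (fn(t + step) - fn(t - step)) / (2 * step)


# Hypothetical example, chosen only for illustration.
def x_of(t):
    return math.cos(t)


def y_of(t):
    return t ** 2


def z_of(x, y):
    return x * y


t0 = 0.7
x0, y0 = x_of(t0), y_of(t0)

# Left-hand side: differentiate z(x(t), y(t)) directly with respect to t.
dz_dt = derivative(lambda t: z_of(x_of(t), y_of(t)), t0)

# Partial derivatives: vary one argument while holding the other fixed.
dz_dx = derivative(lambda x: z_of(x, y0), x0)
dz_dy = derivative(lambda y: z_of(x0, y), y0)

dx_dt = derivative(x_of, t0)
dy_dt = derivative(y_of, t0)

term_x = dz_dx * dx_dt   # contribution to dz/dt through x
term_y = dz_dy * dy_dt   # contribution to dz/dt through y

# Neither term alone matches dz_dt; their sum does (up to numerical error).
print(dz_dt, term_x, term_y, term_x + term_y)
```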

0
On

$$ \frac{dy}{dx} = \lim_{\Delta x\to 0}\frac{\Delta y}{\Delta x} = \lim_{\Delta x\to0} \frac{\Delta y}{\Delta u} \cdot\frac{\Delta u}{\Delta x}. $$ When you write that, then it's just cancellation.

(Ordinary "limit laws" will get you from there to $\left(\lim\limits_{\Delta x\to0}\dfrac{\Delta y}{\Delta u} \right)\cdot \left(\lim\limits_{\Delta x\to0} \dfrac{\Delta u}{\Delta x}\right)$, but notice that the first limit here says "$\Delta x\to 0$," not "$\Delta u \to 0$." That you can put $\Delta u$ there depends on the fact that differentiable functions are continuous. Then there's a moderately hairy difficulty of what to do when $\Delta u=0$ and $\Delta x\ne 0$.)

0
On

If you look at $f$ as a function of $g$, then you'd get that the differential of $f$ is $df=f'(g)dg$.

Now do the same thing for $g$ by considering that it is a function of $x$: $dg=g'(x)dx$.

Hence $df=f'(g)dg=f'(g(x))g'(x)dx$, or, $\frac{df}{dx}=f'(g(x))g'(x)$.

This is what we are doing when using a u-sub in integration: $$\int f'(g(x))g'(x)dx=\int f'(g)dg$$
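
As one concrete instance (the functions are chosen here only for illustration), take $f = \sin$ and $g(x) = x^2$, so that $f'(g) = \cos(g)$ and $dg = 2x\,dx$:
$$\int \cos(x^2)\, 2x\, dx = \int \cos(g)\, dg = \sin(g) + C = \sin(x^2) + C.$$
Differentiating $\sin(x^2)$ with the chain rule gives back $\cos(x^2)\cdot 2x$, as it should.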