Understanding the chain rule and differentials

The chain rule for single-variable calculus is as follows: if $f: I \to \mathcal{R}$ is differentiable at $d \in I$, $f(I) \subset J$, and $g: J \to \mathcal{R}$ is differentiable at $f(d)$, then $g \circ f$ is differentiable at $d$ and its derivative there is $g'(f(d))f'(d)$.

I memorized the proof that uses Carathéodory's theorem, but I still don't understand why we need to go to such trouble to prove this theorem.

I also don't really understand what $dx$, $dy$ and that kind of thing are. For example, why can you treat them like numbers, write $dy = y'(x)dx$, and then substitute that into an integral? I thought $\frac{dy}{dx}$ as a whole was a single symbol.

I do know the epsilon-delta definitions of continuity and differentiability.

This limits my understanding of multivariable calculus very much as well. For example, when they say that $\frac{df(x(t),y(t))}{dt}=f_x\frac{dx}{dt}+f_y\frac{dy}{dt}$, I don't understand it, even though I know how to compute it.

Do you see my problem? Can you please enlighten me?

There are 2 best solutions below

0
On BEST ANSWER

I suspect you are just getting lost in the collection of overused symbols $x,y,f$.

The chain rule for $g \circ f$ is $D (g \circ f) (x) = Dg(f(x))Df(x)$.

Suppose the inner function is $g(t) = \begin{bmatrix} x(t) \\ y(t) \end{bmatrix}$ and the outer function is $f$. Then $Dg(t) = \begin{bmatrix} {\partial x(t) \over \partial t} \\ {\partial y(t) \over \partial t} \end{bmatrix}$ and you have $Df(z) = \begin{bmatrix} {\partial f(z) \over \partial z_1} &{\partial f(z) \over \partial z_2} \end{bmatrix}$,

so $D (f \circ g) (t) = Df(g(t)) Dg(t) = \begin{bmatrix} {\partial f(\begin{bmatrix} x(t) \\ y(t) \end{bmatrix}) \over \partial z_1} & {\partial f(\begin{bmatrix} x(t) \\ y(t) \end{bmatrix}) \over \partial z_2} \end{bmatrix} \begin{bmatrix} {\partial x(t) \over \partial t} \\ {\partial y(t) \over \partial t} \end{bmatrix} = {\partial f(\begin{bmatrix} x(t) \\ y(t) \end{bmatrix}) \over \partial z_1} {\partial x(t) \over \partial t} + {\partial f(\begin{bmatrix} x(t) \\ y(t) \end{bmatrix}) \over \partial z_2} {\partial y(t) \over \partial t} $
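As a numerical sanity check of the matrix form above, here is a minimal sketch. The concrete choices $f(z) = z_1^2 z_2$, $x(t) = \cos t$, $y(t) = \sin t$ and all function names are illustrative assumptions, not taken from the question. It compares the product $Df(g(t))\,Dg(t)$ with a central finite difference of $t \mapsto f(g(t))$.

```python
import math

# Illustrative choices (assumptions, not from the question):
# f(z1, z2) = z1**2 * z2, x(t) = cos t, y(t) = sin t.
def f(z1, z2):
    return z1 ** 2 * z2

def g(t):
    return (math.cos(t), math.sin(t))

def Df(z1, z2):
    # Row vector of partial derivatives [df/dz1, df/dz2].
    return (2 * z1 * z2, z1 ** 2)

def Dg(t):
    # Column vector [x'(t), y'(t)].
    return (-math.sin(t), math.cos(t))

def chain_rule_derivative(t):
    # Df(g(t)) Dg(t): a 1x2 row times a 2x1 column.
    fx, fy = Df(*g(t))
    dx, dy = Dg(t)
    return fx * dx + fy * dy

def finite_difference_derivative(t, h=1e-6):
    # Central difference of the composite t -> f(g(t)).
    return (f(*g(t + h)) - f(*g(t - h))) / (2 * h)

t0 = 0.7
print(chain_rule_derivative(t0), finite_difference_derivative(t0))
```

The two printed numbers should agree to several decimal places, which is all the formula $f_x\frac{dx}{dt}+f_y\frac{dy}{dt}$ claims: a row of partials multiplied by a column of derivatives.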

0
On

I take the differential of $y= f(x)$ to be $dy = f'(x) \Delta x$.

The differential operator takes two variables as inputs: the variable of the original function, and an increment of this variable.

In the case where $f$ is the identity function, we have $f(x)=x$, and the differential of the output is $dx = [x]' \Delta x = (1) \Delta x = \Delta x$. This means that $\Delta x = dx$, which allows us to rephrase the definition:

$ dy = f'(x)dx$.

Note: this definition provides a new way to write the derivative of a function with respect to its variable:

$f'(x)= dy/ dx$.
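To make the definition concrete, here is a small sketch that treats the differential as a function of two inputs, the point $x$ and the increment $dx$. It uses $f(x) = x^2 + 1$ purely as an illustration; the helper names are hypothetical. It compares $dy = f'(x)\,dx$ with the actual change $\Delta y = f(x + dx) - f(x)$.

```python
def f(x):
    return x ** 2 + 1

def f_prime(x):
    return 2 * x

def differential(x, dx):
    # dy = f'(x) * dx: linear in the increment dx.
    return f_prime(x) * dx

x0 = 3.0
for dx in (0.1, 0.01, 0.001):
    delta_y = f(x0 + dx) - f(x0)  # actual change in y
    dy = differential(x0, dx)     # linear approximation of that change
    print(dx, delta_y, dy, delta_y - dy)
```

For this particular $f$ the gap $\Delta y - dy$ equals $dx^2$, so it shrinks much faster than $dx$ itself. That is the sense in which $dy$ is the linear part of the change, and why $dx$ and $dy$ can be handled as ordinary numbers here.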


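Suppose that $z$ is a function $f$ of $y$, which in turn is a function $g$ of $x$, meaning that $z = f(g(x))$.

For example, $y = x^2+1$ and $z = \sqrt{y}$.

By the definition of the differential (with $z'$ denoting the derivative of $z$ with respect to $y$, and $y'$ the derivative of $y$ with respect to $x$), we have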

(1) $ dz = z' dy$

and

(2) $ dy = y' dx$.

Replacing $dy$ in (1) by its value from (2), we get:

(3) $ dz = z' \space y' \space dx$.

But (dividing (1) by $dy$ and (2) by $dx$): $z' = \frac{dz}{dy}$ and $y' = \frac{dy}{dx}$.

Substituting in (3), we get :

(4) $ dz = \frac {dz} {dy} \space \frac {dy} {dx}\space dx$.

Dividing by $dx$, we get the chain rule: $\frac{dz}{dx} = \frac{dz}{dy} \frac{dy}{dx}$.
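Applied to the example $y = x^2+1$, $z = \sqrt{y}$ from earlier in this answer, the same steps can be sketched as follows (a worked illustration under the definitions above):

$$dz = \frac{1}{2\sqrt{y}}\,dy, \qquad dy = 2x\,dx,$$

$$dz = \frac{1}{2\sqrt{y}} \cdot 2x\,dx = \frac{x}{\sqrt{x^2+1}}\,dx, \qquad \text{so} \qquad \frac{dz}{dx} = \frac{x}{\sqrt{x^2+1}}.$$

This agrees with differentiating $z = \sqrt{x^2+1}$ directly.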