Proof of multivariable chain rule


I'm working through a proof of the multivariable chain rule $\displaystyle{\frac{d}{dt}g(t)=\frac{\partial f}{\partial x_1}\frac{dx_1}{dt}+\frac{\partial f}{\partial x_2}\frac{dx_2}{dt}}$ for $g(t)=f(x_1(t),x_2(t))$, but I have a hard time understanding two important steps of this proof.

The proof introduces the functions $\Delta_i(h)=x_i(t+h)-x_i(t)$ for $i=1,2$ and the vector $\bar{\Delta}(h)=(\Delta_1(h),\Delta_2(h))$, so that $\displaystyle{\lim_{h\rightarrow0}\frac{\Delta_i(h)}{h}=x^{'}_i(t)}$. It says that

$\frac{g(t+h)-g(t)}{h}=\frac{f(\bar{x}(t+h))-f(\bar{x}(t))}{h}=\frac{f(\bar{x}(t)+\bar{\Delta}(h))-f(\bar{x}(t))}{h}$

which I understand, but the next step is to state that $f$ is differentiable and then set the previous expression equal to

$=f^{'}_1(\bar{x}(t))\cdot\Delta_1(h)+f^{'}_2(\bar{x}(t))\cdot\Delta_2(h)+o(\vert\vert\bar{\Delta}\vert\vert)$

and this step I do not understand. I think some limit notation might be missing? But even with the limit notation, I'm still not sure how it becomes a partial derivative multiplied by $\Delta_i$.

Afterwards they let $h\rightarrow0$ to get

$=f^{'}_1(\bar{x}(t))\cdot x_1^{'}(t)+f^{'}_2(\bar{x}(t))\cdot x_2^{'}(t)$

Again, I am confused about whether some limit notation is missing.

Does anyone know this version of the proof of the chain rule (besides these two steps, I find it the easiest version to understand), or understand these steps?

[Images of the notes: "Theorem: Multivariable chain rule" and "Proof of theorem"]

Best answer

The multivariable chain rule follows from the theorem on the differentiation of composite functions of several variables, which states in general that if:

$f$ is differentiable at $x_0$ and $g$ is differentiable at $y_0=f(x_0)$, that is:

$$f(x_0+h)=f(x_0)+J_f(x_0)\cdot h+o(|h|)$$

$$g(y_0+k)=g(y_0)+J_g(y_0)\cdot k+o(|k|)$$

then the composite function $g \circ f$ is also differentiable at $x_0$ and:

$$g(f(x_0+h))=g(f(x_0))+J_g(y_0)\cdot J_f(x_0)\cdot h+o(|h|)$$

NOTE

For the proof it is convenient to write:

$o(|h|)=|h|\cdot \omega_f(h)$ with $\omega_f(h) \to 0$ as $h \to 0$

$o(|k|)=|k|\cdot \omega_g(k)$ with $\omega_g(k) \to 0$ as $k \to 0$.
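
With this notation, the proof of the general theorem can be sketched as follows. This is a reconstruction under stated assumptions, not taken from the notes in the question: the increment $k$, the convention $\omega_g(0)=0$, and the operator norm $\|J_f(x_0)\|$ are introduced here for illustration. Set

$$k=f(x_0+h)-f(x_0)=J_f(x_0)\cdot h+|h|\cdot\omega_f(h)$$

so that $f(x_0+h)=y_0+k$ and $k \to 0$ as $h \to 0$. Substituting into the expansion of $g$:

$$g(f(x_0+h))=g(y_0)+J_g(y_0)\cdot J_f(x_0)\cdot h+|h|\,J_g(y_0)\cdot\omega_f(h)+|k|\cdot\omega_g(k)$$

Since $|k|\le\left(\|J_f(x_0)\|+|\omega_f(h)|\right)|h|$, both remainder terms divided by $|h|$ tend to $0$ as $h \to 0$, so together they are $o(|h|)$.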

In the special case of:

$$f: \mathbb{R} \to \mathbb{R}^n, \quad t \mapsto (x_1(t),x_2(t),\dots,x_n(t))$$

$$g: \mathbb{R^n} \to \mathbb{R}$$

$$\phi=g\circ f: \mathbb{R} \to \mathbb{R}$$

we have

$$J_f(t)= \begin{bmatrix} \frac{dx_1}{d t} \\ \vdots \\ \frac{dx_n}{d t} \end{bmatrix} = \begin{bmatrix} x_1' \\ \vdots \\ x_n' \end{bmatrix}$$

$$J_g(x)= \nabla g = \left( \frac{\partial g}{\partial x_1},\dots,\frac{\partial g}{\partial x_n} \right)$$

And finally:

$$J_g(x)\cdot J_f(t)=\frac{\partial g}{\partial x_1}\frac{dx_1}{dt}+\dots+\frac{\partial g}{\partial x_n}\frac{dx_n}{dt}$$

That is the chain rule for this particular case.
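
As a quick sanity check of the formula, here is a minimal numerical sketch. The function $g(x_1,x_2)=x_1^2x_2$, the path $(\cos t,\sin t)$, the point `t0`, and all helper names are illustrative assumptions, not part of the question or the notes. It compares the chain-rule expression with a central finite difference of $\phi=g\circ f$.

```python
import math

# Illustrative choices (assumptions): path x(t) = (cos t, sin t), g(x1, x2) = x1^2 * x2.

def x(t):
    """Inner function f: R -> R^2."""
    return (math.cos(t), math.sin(t))

def x_prime(t):
    """Jacobian of the inner function, i.e. (x1'(t), x2'(t))."""
    return (-math.sin(t), math.cos(t))

def g(x1, x2):
    """Outer function g: R^2 -> R."""
    return x1 ** 2 * x2

def grad_g(x1, x2):
    """Gradient of g, computed by hand: (dg/dx1, dg/dx2)."""
    return (2 * x1 * x2, x1 ** 2)

def phi(t):
    """Composite function phi = g o f."""
    return g(*x(t))

def chain_rule_derivative(t):
    """Row vector J_g(x(t)) times column vector J_f(t)."""
    return sum(a * b for a, b in zip(grad_g(*x(t)), x_prime(t)))

def central_difference(fn, t, h=1e-6):
    """Numerical derivative of a scalar function of one variable."""
    return (fn(t + h) - fn(t - h)) / (2 * h)

t0 = 0.7
analytic = chain_rule_derivative(t0)
numeric = central_difference(phi, t0)
print(analytic, numeric)
```

The two printed values should agree to several decimal places; the small difference comes from the finite-difference approximation, not from the chain rule.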

See also: Derivation of the multivariate chain rule

The general theorem yields similar rules in any other case, using the Jacobian matrices $J_f$ and $J_g$.
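
The two steps asked about can be sketched in the question's own notation, where the outer function is $f$ and the inner one is $\bar{x}$. The function $\omega$ below is an assumption introduced for illustration, with $\omega(\bar{\Delta})\to0$ as $\bar{\Delta}\to0$ and $\omega(0)=0$. Differentiability of $f$ at $\bar{x}(t)$ is a statement about the numerator alone and involves no limit:

$$f(\bar{x}(t)+\bar{\Delta})-f(\bar{x}(t))=f^{'}_1(\bar{x}(t))\cdot\Delta_1+f^{'}_2(\bar{x}(t))\cdot\Delta_2+\|\bar{\Delta}\|\cdot\omega(\bar{\Delta})$$

Dividing by $h$ gives

$$\frac{g(t+h)-g(t)}{h}=f^{'}_1(\bar{x}(t))\cdot\frac{\Delta_1}{h}+f^{'}_2(\bar{x}(t))\cdot\frac{\Delta_2}{h}+\frac{\|\bar{\Delta}\|}{h}\cdot\omega(\bar{\Delta})$$

As $h\to0$ we have $\Delta_i/h\to x_i^{'}(t)$, so $\|\bar{\Delta}\|/|h|$ stays bounded, while $\bar{\Delta}\to0$ forces $\omega(\bar{\Delta})\to0$. The last term therefore vanishes in the limit, leaving

$$g^{'}(t)=f^{'}_1(\bar{x}(t))\cdot x_1^{'}(t)+f^{'}_2(\bar{x}(t))\cdot x_2^{'}(t)$$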