Proving multivariable chain rule

Let $f:G\to G'$, $G\subset \Bbb{R^n}$, $G'\subset \Bbb{R^m}$ be differentiable at $x_0 \in G$ and let $g:G' \to \Bbb{R^p}$ be differentiable at $y_0 = f(x_0)\in G'$. Then the composite mapping $g\circ f:G\to \Bbb{R^p}$ is differentiable at $x_0$ and $D_{g\circ f}(x_0)=D_g(f(x_0))\cdot D_f(x_0)$.

I'm going over the proof. Let $f$ be differentiable at $x_0$ and $g$ be differentiable at $y_0 =f(x_0)$. I begin with $$g(f(x_0 +u))=g(f(x_0)+D_f(x_0)u+\vert u\vert\epsilon_1(u))$$

$$=g(y_0) + D_g(y_0)(D_f(x_0)u+\vert u\vert\epsilon_1(u))+\vert h(u)\vert D_g(y_0)(\epsilon_2(h(u)))$$

How do we arrive at the second equality? We apply $g$ to each term of the sum, and somehow $g(D_f(x_0)u)=D_g(y_0)(D_f(x_0)u+\vert u\vert\epsilon_1(u))$.

Best answer

From your post, I assume you are ok with the first equality, so we are at

$$ g(f(x_0+u)) = g \Big( f(x_0) + D_f(x_0)u + |u| \varepsilon_1(u) \Big) $$

Then you say:

> We apply $g$ to each term in the sum, and somehow ...

We do not apply $g$ to each term in the sum. The next step is actually the same thing that happened in the first step: in the first step, you used the linear approximation of $f$ to say that

$$ f(x_0+u) = f(x_0) + D_f(x_0) \cdot u + |u| \varepsilon_1(u) $$

The next step that you are asking about does the same thing, but with $g$. In place of $x_0$ we now have $y_0=f(x_0)$, and in place of $u$ we now have

$$ v = D_f(x_0) \cdot u + |u|\varepsilon_1(u) $$

So we get

$$ g(y_0 + v) = g(y_0) + D_g(y_0) \cdot v + |v|\varepsilon_2(v) $$

Now just plug in the expression for $v$ and you get what you have in your post. Note that on your last line you call this quantity $h(u)$, where I call it $v$.
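
For completeness, here is a sketch of what the substitution looks like when written out. It assumes the usual convention, which the post does not state explicitly, that $\varepsilon_1(u)\to 0$ as $u\to 0$ and $\varepsilon_2(v)\to 0$ as $v\to 0$; the operator norm $\Vert D_f(x_0)\Vert$ is notation introduced here for the estimate. Using linearity of $D_g(y_0)$ on the middle term:

$$
\begin{aligned}
g(f(x_0+u)) &= g(y_0) + D_g(y_0)\big(D_f(x_0)u + |u|\varepsilon_1(u)\big) + |v|\varepsilon_2(v) \\
&= g(y_0) + D_g(y_0)D_f(x_0)u + \underbrace{|u|\,D_g(y_0)\varepsilon_1(u) + |v|\varepsilon_2(v)}_{\text{remainder}}
\end{aligned}
$$

Since $|v| \le \big(\Vert D_f(x_0)\Vert + |\varepsilon_1(u)|\big)|u|$, we have $v\to 0$ as $u\to 0$, and the remainder divided by $|u|$ is bounded by

$$
\Vert D_g(y_0)\Vert\,|\varepsilon_1(u)| + \big(\Vert D_f(x_0)\Vert + |\varepsilon_1(u)|\big)|\varepsilon_2(v)| \to 0 \quad \text{as } u \to 0 .
$$

Under these assumptions the remainder is $o(|u|)$, so the linear part $D_g(y_0)D_f(x_0)$ is the derivative of $g\circ f$ at $x_0$, which is the formula in the statement.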