The proof of the multivariable chain rule (Vector form)


Let $f$ be a function defined on an open set $U$, and let $X(t)$ be a curve such that $X(t)$ is contained in $U$ for all $t$.

We define the function $g(t)=f(X(t))$. I want to know what $dg/dt$ is. This is my attempt: $$\frac{dg}{dt}=\lim_{h \to 0}\frac{f(X(t+h))-f(X(t))}{h}$$ Now let $k=X(t+h)-X(t)\iff X(t+h)=k+X(t)$, so that $$\frac{dg}{dt}=\lim_{h \to 0}\frac{f(X(t)+k)-f(X(t))}{h}$$ I am going to use a trick now: $$\frac{dg}{dt}=\lim_{h \to 0}\frac{f(X(t)+k)-f(X(t))}{h}\frac{k}{k}= \lim_{h \to 0}\frac{f(X(t)+k)-f(X(t))}{k}\frac{k}{h}$$ Remember that as $h\to 0$, $k\to 0$, so we get $$\frac{dg}{dt}=f'(X(t))\cdot X'(t)$$ But this result is completely false, because $X'(t)$ is a curve and $f'(X(t))$ is a function, so multiplying the two gives a curve, whereas $dg/dt$ is a function.

Please help me find the mistake.


There are 2 best solutions below


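I think one can proceed via a first-order Taylor expansion. Indeed, suppose $f:U\subset \mathbb{R}^n\rightarrow \mathbb{R}$ is continuously differentiable and $X$ is differentiable at $t$. Then $f$ can be expanded about the point $X(t)$ to arrive at: \begin{equation} f(X(t+h))=f(X(t))+{(X(t+h)-X(t))}^T\cdot Df(X(t))+R(h), \end{equation} where $Df(X(t))$ is the gradient of $f$ at $X(t)$, written as a column vector, and $R(h)$ is the remainder term, which satisfies $R(h)/\|X(t+h)-X(t)\|\to 0$ as $h\to 0$ (and $R(h)=0$ whenever $X(t+h)=X(t)$). Plugging this expression into the definition of the derivative, we find that: \begin{equation} \begin{split} \frac{dg}{dt}&=\lim_{h\rightarrow 0}\frac{f(X(t+h))-f(X(t))}{h}\\ &=\lim_{h\rightarrow 0}\frac{{(X(t+h)-X(t))}^T\cdot Df(X(t))+R(h)}{h}\\ &=\lim_{h\rightarrow 0}\frac{{(X(t+h)-X(t))}^T}{h}\cdot Df(X(t))+\lim_{h\rightarrow 0}\frac{R(h)}{h}\\ &=X'(t)^T\cdot Df(X(t)). \end{split} \end{equation} The remainder term vanishes in the limit because, whenever $X(t+h)\neq X(t)$, $$\left|\frac{R(h)}{h}\right|=\frac{|R(h)|}{\|X(t+h)-X(t)\|}\cdot\frac{\|X(t+h)-X(t)\|}{|h|},$$ where the first factor tends to $0$ and the second tends to $\|X'(t)\|$, which is finite. Note that nothing is ever divided by the vector $X(t+h)-X(t)$ itself, only by its norm, and the result is a dot product of two vectors, hence a scalar, as $dg/dt$ should be.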
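As a sanity check on the formula $g'(t)=X'(t)^T\cdot Df(X(t))$, here is a minimal numerical sketch. The particular choices $f(x,y)=x^2y$ and $X(t)=(\cos t,\sin t)$, as well as all function names, are assumptions made purely for illustration and do not come from the question. It compares the dot product with a central finite-difference approximation of $g'(t)$.

```python
import math

# Hypothetical example: f(x, y) = x^2 * y along the unit circle X(t) = (cos t, sin t).
def f(x, y):
    return x**2 * y

def grad_f(x, y):
    # Gradient of f computed by hand: (df/dx, df/dy) = (2xy, x^2).
    return (2 * x * y, x**2)

def X(t):
    return (math.cos(t), math.sin(t))

def X_prime(t):
    # Componentwise derivative of the curve.
    return (-math.sin(t), math.cos(t))

def g(t):
    # The composition g = f o X, a scalar function of t.
    return f(*X(t))

def chain_rule_derivative(t):
    # Dot product X'(t) . grad f(X(t)), which is a scalar.
    gx, gy = grad_f(*X(t))
    dx, dy = X_prime(t)
    return gx * dx + gy * dy

def central_difference(func, t, h=1e-6):
    # Symmetric finite-difference approximation of func'(t).
    return (func(t + h) - func(t - h)) / (2 * h)

t0 = 0.7
print(chain_rule_derivative(t0), central_difference(g, t0))
```

The two printed numbers should agree to several decimal places, which is consistent with the derivative being the scalar product of the two vectors rather than a "curve".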


Let me base my answer on Vladimir A. Zorich, Mathematical Analysis I, Springer, 2016, page 441, Differentiation of a Composition of Mappings (Chain Rule).

Assume we have a curve $X\colon [a,b]\to \mathbb{R}^n$, with $[a,b]\subset \mathbb{R}$, and a function $f\colon \mathbb{R}^n \to \mathbb{R}$ with the properties given in the OP, i.e. such that the derivative of the composition $g=f \circ X$ exists at the point $y=X(t)$. Then it is calculated by the formula

$$g'(t)=(f \circ X)'(t)=\begin{bmatrix} \partial_1 f(y) & \partial_2 f(y) & \cdots & \partial_n f(y) \end{bmatrix}\cdot\begin{bmatrix} \partial_1 X^1(t) \\ \partial_1 X^2(t) \\ \vdots \\ \partial_1 X^n(t) \end{bmatrix} = \sum\limits_{i=1}^{n}\partial_i f(y) \cdot \partial_1 X^i(t)$$
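
To see the formula at work, here is a small worked example. The choices $f(x_1,x_2)=x_1x_2$ and $X(t)=(t,t^2)$ are assumptions picked only for illustration. With $y=X(t)$ we have $\partial_1 f(y)=y_2=t^2$, $\partial_2 f(y)=y_1=t$, $\partial_1 X^1(t)=1$ and $\partial_1 X^2(t)=2t$, so

$$g'(t)=\begin{bmatrix} t^2 & t \end{bmatrix}\cdot\begin{bmatrix} 1 \\ 2t \end{bmatrix}=t^2\cdot 1+t\cdot 2t=3t^2,$$

which agrees with differentiating $g(t)=f(X(t))=t\cdot t^2=t^3$ directly. The row vector times the column vector is a single number for each $t$, so $g'$ is a scalar function.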