How does the chain rule work for functions from vectors to vectors?


Suppose I have a function:

$$ \vec{s} = \vec{f}\left(\vec{\theta}\right)$$

and a derivative:

$$ \vec{v} = \frac{\mathrm{d} \vec{s}}{\mathrm{d} t}$$

How do I apply the chain rule?

For simplicity, let's call $\vec{\omega} = \frac{\mathrm{d} \vec{\theta}}{\mathrm{d} t}$.

I think the chain rule should be something along the lines of:

$$ \vec{v} = \vec{\omega} \times \nabla_{\vec{\theta}} \vec{f}\left(\vec{\theta}\right) $$

but I don't know the exact rule.

I think I may have to use matrices and more complicated derivatives like the Jacobian.


There are 4 best solutions below

2
On

Here $\theta:\mathbb R\rightarrow \mathbb R^n$ and $f:\mathbb R^n\rightarrow \mathbb R^p$, so $f\circ \theta:\mathbb R\rightarrow \mathbb R^p$. The derivative of $f\circ\theta$ at $t$ is therefore a vector in $\mathbb R^p$: its $i$-th component is $\nabla F_i(\theta(t))\cdot \theta'(t)$, where $F_i:\mathbb R^n\rightarrow \mathbb R$, $i=1,\dots,p$, are the components of $f$.
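As a quick numerical sanity check of this componentwise formula, here is a minimal sketch in plain Python. The maps `theta` and `f` (with $n = 2$, $p = 2$) are made up for illustration and do not come from the question; the gradients are written out by hand and compared against a central finite difference of the composite.

```python
import math

def theta(t):
    # Hypothetical curve theta: R -> R^2 (an assumption for illustration)
    return (t * t, math.sin(t))

def theta_prime(t):
    # Derivative of theta, computed by hand
    return (2.0 * t, math.cos(t))

def f(x):
    # Hypothetical map f: R^2 -> R^2 with components F_1 = a*b, F_2 = a + b^2
    a, b = x
    return (a * b, a + b * b)

def grad_F(x):
    # Hand-computed gradients of F_1 and F_2, one tuple per component
    a, b = x
    return ((b, a), (1.0, 2.0 * b))

def chain_rule_derivative(t):
    # i-th component: grad F_i(theta(t)) . theta'(t)
    x, dx = theta(t), theta_prime(t)
    return tuple(sum(g * d for g, d in zip(row, dx)) for row in grad_F(x))

def finite_difference(t, h=1e-6):
    # Central difference of the composite f(theta(t))
    hi, lo = f(theta(t + h)), f(theta(t - h))
    return tuple((u - w) / (2.0 * h) for u, w in zip(hi, lo))

t0 = 0.7
analytic = chain_rule_derivative(t0)
numeric = finite_difference(t0)
print(analytic)
print(numeric)
```

The two printed tuples should agree to several decimal places, which is what the formula predicts for any differentiable choice of `theta` and `f`.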

3
On

For a scalar-valued $f$, the gradient is defined by the property $$\langle \nabla f(p), X\rangle = df(p)\, X,$$ so $$\frac{d}{dt}(f\circ \theta)(t) = df(\theta(t))\,\theta^\prime(t) = \langle \nabla f(\theta(t)), \theta^\prime(t)\rangle,$$ where $\theta^\prime=\frac{d}{dt}\theta$. For a vector-valued $f$, the same identity holds for each component.

4
On

Let $f: D \subset \mathbb R^n \to \mathbb R$ be a scalar field defined in an open ball about $a$ and let $x: I \subset \mathbb R \to \mathbb R^n$ be a vector-valued function defined in an open interval about $t_0$. Let $x(t_0) = a$ and $x(I) \subset D$. If $f$ is differentiable at $a$ and $x$ is differentiable at $t_0$, then $f \circ x$ is differentiable at $t_0$ and its derivative is given by:

$\frac{d}{dt}f(x(t))\Big|_{t=t_0} = \nabla f(x(t_0))\cdot x'(t_0)$

We can write this formula in scalar form as: $\frac{d}{dt}(f \circ x) = \frac{\partial f}{\partial x_1}\frac{dx_1}{dt} + \cdots + \frac{\partial f}{\partial x_n}\frac{dx_n}{dt}$.

You can also write this in matrix form; recalling the definition of the Jacobian matrix $Df$, it becomes:

$\frac{d}{dt}f(x(t))=Df(x(t))\,Dx(t)$.
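To make the matrix form concrete, here is the product written out for the scalar field above. The row-vector/column-vector layout is the usual convention and is assumed here rather than stated in the answer:

$$\frac{d}{dt}f(x(t)) = \begin{pmatrix} \dfrac{\partial f}{\partial x_1} & \cdots & \dfrac{\partial f}{\partial x_n} \end{pmatrix} \begin{pmatrix} \dfrac{dx_1}{dt} \\ \vdots \\ \dfrac{dx_n}{dt} \end{pmatrix} = \sum_{j=1}^{n} \frac{\partial f}{\partial x_j}\frac{dx_j}{dt}$$

For a vector-valued $f:\mathbb R^n \to \mathbb R^p$, $Df$ becomes the $p \times n$ matrix with entries $\frac{\partial f_i}{\partial x_j}$, and the same product gives a vector in $\mathbb R^p$, which is the case asked about in the question.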

0
On

Consider a particular component $v_k$ of the velocity vector $\vec{v}(\vec{\theta})$. Then the chain rule yields

$$v_k = \frac{ds_k}{dt}= \sum_{j=1}^3 \frac{\partial s_k}{\partial \theta_j}\frac{d\theta_j}{dt} =\sum_{j=1}^3 \frac{\partial s_k}{\partial \theta_j}\omega_j.$$

If we understand $\nabla_{\vec{\theta}} F=\sum_{j=1}^3 \frac{\partial F}{\partial \theta_j}\hat{e}_j$, the above may be written as $v_k=\nabla_{\vec{\theta}}s_k\cdot \vec{\omega}$, i.e. $\vec{v}=\nabla_{\vec{\theta}}\vec{s}\cdot \vec{\omega}$. This reflects the fact that $\nabla_{\vec{\theta}}\vec{s}$ is a rank-2 tensor (the Jacobian matrix), so contracting it with $\vec{\omega}$ by a dot product, rather than a cross product, is what yields a vector.
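The sum over $j$ is just a matrix-vector product, $\vec{v} = J\,\vec{\omega}$ with $J_{kj} = \partial s_k / \partial \theta_j$. Below is a rough sketch in plain Python that approximates $J$ by central differences and then applies it to $\vec{\omega}$. The map `f`, the helper names `numerical_jacobian` and `velocity`, and the sample values of `theta0` and `omega` are all hypothetical choices for illustration, not taken from the question.

```python
import math

def f(theta):
    # Hypothetical map s = f(theta) from R^3 to R^3 (an assumption, not from the question)
    t1, t2, t3 = theta
    return [t1 * t2, math.sin(t3), t1 + t2 * t3]

def numerical_jacobian(func, theta, h=1e-6):
    # jac[k][j] approximates d s_k / d theta_j by central differences
    n_out = len(func(theta))
    jac = [[0.0] * len(theta) for _ in range(n_out)]
    for j in range(len(theta)):
        plus = list(theta)
        minus = list(theta)
        plus[j] += h
        minus[j] -= h
        f_plus, f_minus = func(plus), func(minus)
        for k in range(n_out):
            jac[k][j] = (f_plus[k] - f_minus[k]) / (2.0 * h)
    return jac

def velocity(func, theta, omega):
    # v_k = sum_j (d s_k / d theta_j) * omega_j, i.e. v = J omega
    jac = numerical_jacobian(func, theta)
    return [sum(row[j] * omega[j] for j in range(len(omega))) for row in jac]

theta0 = [0.5, -1.0, 2.0]   # sample point theta (hypothetical)
omega = [1.0, 0.25, -0.5]   # sample d(theta)/dt (hypothetical)
v = velocity(f, theta0, omega)
print(v)
```

In practice an automatic-differentiation library would usually replace the finite-difference Jacobian, but the structure of the computation (Jacobian times $\vec{\omega}$) stays the same.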