Chain rule with intermediate vector function


Imagine we have two functions, $f:\mathbb{R}^n \to \mathbb{R}$ and $g:\mathbb{R} \to \mathbb{R}^n$. We want to differentiate their composition $f(g(x))$, and I want to do it in matrix form. If I do so naively, I get nonsense: $$ \frac{\partial f(g(x))}{\partial x} = \frac{\partial f}{\partial v} \frac{\partial g}{\partial x}, $$ where $v = g(x)$. The first factor is a row vector, and the second is also a row vector, so we can't multiply them. Where's the mistake? How can it be fixed?

2

There are 2 best solutions below

1
On BEST ANSWER

For $g:\>{\mathbb R}\to{\mathbb R}^n$ the matrix $[dg(t)]$ is a column vector, and for $f:\>{\mathbb R}^n\to{\mathbb R}$ the matrix $[df(x)]$ is a row vector. It follows that the composed map $\phi:=f\circ g:\>{\mathbb R}\to{\mathbb R}$ has derivative $d\phi(t)=df\bigl(g(t)\bigr)\circ dg(t)$ with $1\times1$-matrix $$[\phi'(t)]=\bigl[df\bigl(g(t)\bigr)\bigr]\,[dg(t)]\ ,$$ which works perfectly well in matrix terms.
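As a numerical sanity check of the shapes, here is a minimal sketch in Python with NumPy. The particular functions `g(t) = (t, t^2, sin t)` and `f(v) = v_0 v_1 + v_2^2` are assumptions chosen only for illustration; they do not come from the question. The Jacobian of `g` is stored as a $3\times1$ column, the Jacobian of `f` as a $1\times3$ row, and their product is compared with a central finite difference of the composition.

```python
import numpy as np

# Hypothetical example functions (not from the question): n = 3.
def g(t):
    """g: R -> R^3."""
    return np.array([t, t**2, np.sin(t)])

def f(v):
    """f: R^3 -> R."""
    return v[0] * v[1] + v[2] ** 2

def jac_g(t):
    """Jacobian of g at t: an n x 1 column vector."""
    return np.array([[1.0], [2.0 * t], [np.cos(t)]])

def jac_f(v):
    """Jacobian of f at v: a 1 x n row vector."""
    return np.array([[v[1], v[0], 2.0 * v[2]]])

def chain_rule_derivative(t):
    """Row (1 x n) times column (n x 1) gives a 1 x 1 matrix."""
    return jac_f(g(t)) @ jac_g(t)

def finite_difference(t, h=1e-6):
    """Central difference approximation of d/dt f(g(t))."""
    return (f(g(t + h)) - f(g(t - h))) / (2.0 * h)

t0 = 0.7
product = chain_rule_derivative(t0)
print("shape of [df][dg]:", product.shape)
print("chain rule value: ", product[0, 0])
print("finite difference:", finite_difference(t0))
```

With these conventions the product is well defined, and the two printed values should agree up to the finite-difference error.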

3
On

The correct formula is \begin{align*} \dfrac{d(f\circ g)(x)}{dx}=\sum_{i=1}^{n}\dfrac{\partial f}{\partial v_{i}}(g(x))\,\dfrac{dg_{i}(x)}{dx}, \end{align*} where $g=(g_{1},\ldots,g_{n})$.
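A small worked instance may help; the functions here are chosen only for illustration and are not from the question. Take $n=2$, $g(x)=(x,\,x^{2})$ and $f(v_{1},v_{2})=v_{1}v_{2}$. Then $\dfrac{\partial f}{\partial v_{1}}=v_{2}$, $\dfrac{\partial f}{\partial v_{2}}=v_{1}$, $\dfrac{dg_{1}}{dx}=1$ and $\dfrac{dg_{2}}{dx}=2x$, so the sum gives $$\dfrac{d(f\circ g)(x)}{dx}=x^{2}\cdot 1+x\cdot 2x=3x^{2},$$ which agrees with differentiating $f(g(x))=x^{3}$ directly.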