Composition of multivariate functions.

How can I understand a theorem about the composition of two functions:

$g = \gamma: I \subset \mathbb{R} \rightarrow \mathbb{R}^m$

$f: \mathbb{R}^m \rightarrow \mathbb{R}$

if it looks like this:

$d(f \circ \gamma)(t_0) = df(\gamma(t_0))d\gamma(t_0)=\langle\nabla f(\gamma(t_0)), \gamma^{\prime}(t_0)\rangle$

Isn't $d(f \circ \gamma)(t_0)$ just another way to write $df(\gamma(t_0))$? How can I understand this notation?

There are 2 best solutions below

6
On BEST ANSWER

No. For a function $f:\mathbb R^n \to \mathbb R^m$, the differential takes two inputs, $$df:\mathbb R^n \times \mathbb R^n \to \mathbb R^m.$$ It is linear in the second input, but in general not linear in the first.

$$d(f\circ \gamma)(t_0)$$ is the differential of $f\circ \gamma$ evaluated at $t_0$. One argument remains unevaluated, so the result is a linear map $\mathbb R\to\mathbb R$, since $f\circ\gamma$ is real-valued.

$$df(\gamma(t_0))$$ is the differential of $f$ evaluated at $\gamma(t_0)$. Again this is a linear map, here $\mathbb R^m \to \mathbb R$, so it takes in a vector. This vector would be the result of $d\gamma(t_0)$ applied to a real number, but that number is not yet given. The final result $df(\gamma(t_0))\, d\gamma(t_0)$ is a linear map $\mathbb R \to \mathbb R$.

The final equality is a basic version of the Riesz representation theorem: the linear map $df(\gamma(t_0))$ is the same as matrix multiplication by a certain row vector, namely $\nabla f(\gamma(t_0))$, and that vector is written out for you there.

PS: in dimension one, $df(x)(h)$ is the product of real numbers $f'(x)h$. Here it's clear that the two terms you are asking about are different, and the stated equality is a generalisation of the chain rule from high school.
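If it helps to see the stated equality with numbers, here is a minimal sketch that compares $\langle\nabla f(\gamma(t_0)), \gamma'(t_0)\rangle$ with a finite-difference estimate of $(f\circ\gamma)'(t_0)$. The curve $\gamma(t) = (\cos t, \sin t)$, the function $f(x, y) = x^2 y$, the point $t_0 = 0.7$ and all function names are my own choices for illustration, not part of the question.

```python
import math

def gamma(t):
    """Hypothetical curve gamma: R -> R^2, the unit circle."""
    return (math.cos(t), math.sin(t))

def gamma_prime(t):
    """Velocity gamma'(t), differentiated by hand."""
    return (-math.sin(t), math.cos(t))

def f(p):
    """Hypothetical scalar function f: R^2 -> R, f(x, y) = x^2 * y."""
    x, y = p
    return x * x * y

def grad_f(p):
    """Gradient of f, differentiated by hand: (2xy, x^2)."""
    x, y = p
    return (2 * x * y, x * x)

def chain_rule_derivative(t0):
    """Right-hand side of the theorem: <grad f(gamma(t0)), gamma'(t0)>."""
    g = grad_f(gamma(t0))
    v = gamma_prime(t0)
    return sum(gi * vi for gi, vi in zip(g, v))

def finite_difference_derivative(t0, h=1e-6):
    """Central-difference estimate of (f o gamma)'(t0), the left-hand side."""
    return (f(gamma(t0 + h)) - f(gamma(t0 - h))) / (2 * h)

t0 = 0.7
print("chain rule:       ", chain_rule_derivative(t0))
print("finite difference:", finite_difference_derivative(t0))
```

The two printed values should agree up to the finite-difference error. Note that the gradient is evaluated at the point $\gamma(t_0)$ on the curve, not at $t_0$ itself, which is exactly the distinction between $df(\gamma(t_0))$ and $d(f\circ\gamma)(t_0)$.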

1
On

Now, I think I see where your confusion may lie. As pointed out in the comment above, more brackets might be helpful.

So, since $(f \circ \gamma)(t_0) = f(\gamma(t_0))$ is a composition of two functions, you need the chain rule when taking the derivative. You need to be aware of what each $d$ refers to. In the expression $d(f\circ \gamma)(t_0) = d(f(\gamma(t_0)))$ the differential still has to be taken with the chain rule, which gives $$ d(f \circ \gamma)(t_0) = (df)(\gamma (t_0)) \circ (d\gamma)(t_0), $$ where now each differential (or total derivative) is that of a single function. So $(df)(\gamma(t_0))$ is to be read as the differential $df$ evaluated at the point $\gamma(t_0)$.

Now, since we are in $\mathbb{R}$ and $\mathbb{R}^m$ respectively (and I assume everything is smooth enough, e.g. the total derivatives exist), you can identify each differential with its Jacobian matrix, so that the composition $\circ$ becomes a matrix-matrix multiplication, which in your case you expressed as a scalar product.
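To spell out the shapes under that identification (writing $J$ for the Jacobian matrix and $\partial_i f$ for the $i$-th partial derivative, which is my notation, not yours): $f$ has a $1\times m$ Jacobian, $\gamma$ has an $m\times 1$ Jacobian, and their product is $1\times 1$, i.e. a single real number:

$$ \underbrace{J_{f\circ\gamma}(t_0)}_{1\times 1} = \underbrace{J_f(\gamma(t_0))}_{1\times m}\,\underbrace{J_\gamma(t_0)}_{m\times 1} = \begin{pmatrix}\partial_1 f(\gamma(t_0)) & \cdots & \partial_m f(\gamma(t_0))\end{pmatrix} \begin{pmatrix}\gamma_1'(t_0)\\ \vdots\\ \gamma_m'(t_0)\end{pmatrix} = \sum_{i=1}^m \partial_i f(\gamma(t_0))\,\gamma_i'(t_0) = \langle\nabla f(\gamma(t_0)), \gamma'(t_0)\rangle. $$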

Was this related to your question?