Total derivative for functions with higher dimensional codomain.

I've been learning some multivariable calculus and I ran into the concept of total derivative. I think I grasp the idea and I've seen several examples of how to calculate it for functions $\mathbb{R}^n \rightarrow \mathbb{R}$, but I don't quite see how it works when the codomain is different than $\mathbb{R}$. For example, say we have a function $f: \mathbb{R}^3 \rightarrow \mathbb{R}^2$ given by:

$$f(x,y,z) = (x+y, z)$$

What would be the total derivative? Would it take the form of a matrix, i.e. something like the Jacobian? Judging from other posts such as "Relating the total derivative and the Jacobian matrix" (which still deals with codomain $\mathbb{R}$), the two are not quite the same thing. So would the total derivative just be the Jacobian applied to an arbitrary element of $\mathbb{R}^3$, as in the reply to that post?

As a follow-up: for the one-dimensional case I've seen total derivatives calculated when the variables are all functions of some parameter. How would that work in the multivariable case? Picking the example above, let's say we have a function $g: \mathbb{R}^2 \rightarrow \mathbb{R}$ given by $g(x,y)=xy$. What would be the total derivative of $f(x,y,g(x,y))$ with respect to $x$?

I've tried to find sources online, but most stick with a one-dimensional codomain, so if you have any sources that would help clarify this for me I would also appreciate it. Thanks!

Best answer

There is no separate "total derivative". What you call the "total" derivative is just the derivative of a function between two vector spaces of arbitrary dimension, normally written simply $df$. Let me clarify what I mean.
Consider a function $f:\mathbb{R}^M \longrightarrow \mathbb{R}^N$. We define the derivative of $f$ at $a$, written $df(a)$, to be the unique linear map $df(a): \mathbb{R}^M \longrightarrow \mathbb{R}^N$ such that $$f(a+h)-f(a)=df(a)[h]+o(|h|) \quad \text{as } h \to 0,$$ that is, the remainder $f(a+h)-f(a)-df(a)[h]$ divided by $|h|$ tends to zero. This essentially means that $df(a)$ is the best linear approximation of $f$ at $a$. Why do I call this the derivative? Because the definition of the derivative is, always, that of the best linear approximation of a function at a point. For real functions of a real variable, this is obscured by the fact that the linear maps are just scalar multiplications, so the derivative ends up being almost indistinguishable from a number.
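
As a rough numerical illustration of the $o(|h|)$ condition, the sketch below uses the map $F(x,y)=(x+y,\,xy)$ (the composite that appears further down; $f$ itself is linear, so its remainder would be exactly zero). The names `F`, `dF` and `remainder_ratio` are made up for this example, and the candidate derivative is simply assumed and then tested.

```python
import math

def F(x, y):
    # Illustrative nonlinear map R^2 -> R^2
    return (x + y, x * y)

def dF(a, h):
    # Candidate derivative of F at a = (a1, a2), applied to h = (h1, h2)
    a1, a2 = a
    h1, h2 = h
    return (h1 + h2, a2 * h1 + a1 * h2)

def remainder_ratio(a, h):
    # |F(a+h) - F(a) - dF(a)[h]| / |h|, which should tend to 0 as h -> 0
    Fa = F(*a)
    Fah = F(a[0] + h[0], a[1] + h[1])
    lin = dF(a, h)
    r = [Fah[i] - Fa[i] - lin[i] for i in range(2)]
    return math.hypot(*r) / math.hypot(*h)

a = (2.0, 3.0)
# Shrink h = (t, t) and watch the ratio shrink with it
ratios = [remainder_ratio(a, (t, t)) for t in (1e-1, 1e-2, 1e-3)]
print(ratios)
```

Here the remainder is $(0, h_1h_2)$, so for $h=(t,t)$ the ratio behaves like $t/\sqrt{2}$ and goes to zero, which is what the definition requires.
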
How does this relate to the Jacobian? Well, you learned, hopefully, in Linear Algebra that linear maps can be represented by matrices once bases are chosen. The Jacobian matrix is just the matrix representing $df(a)$ when you choose the canonical bases of $\mathbb{R}^M$ and $\mathbb{R}^N$.
So, taking your first example, its derivative, or what you called the total derivative, is the map that does this at every point $a$: $$df(a)[(h_1,h_2,h_3)]=(h_1+h_2,h_3)$$ You'll notice that this is just the function $f$; well of course it is! The function was linear to begin with, so of course its best linear approximation is itself. You can also choose the canonical bases and write the Jacobian matrix: $$\begin{pmatrix} 1 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}$$
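
If it helps to see this numerically, here is a minimal sketch that approximates the Jacobian of $f(x,y,z)=(x+y,z)$ by central finite differences. The helper `jacobian` and the step size `eps` are assumptions made for this illustration, not part of any library.

```python
def f(x, y, z):
    # The map from the question, R^3 -> R^2
    return (x + y, z)

def jacobian(func, a, eps=1e-6):
    # Central differences: column j is (func(a + eps*e_j) - func(a - eps*e_j)) / (2*eps)
    n_out = len(func(*a))
    J = [[0.0] * len(a) for _ in range(n_out)]
    for j in range(len(a)):
        plus = list(a)
        plus[j] += eps
        minus = list(a)
        minus[j] -= eps
        fp, fm = func(*plus), func(*minus)
        for i in range(n_out):
            J[i][j] = (fp[i] - fm[i]) / (2 * eps)
    return J

# The evaluation point is arbitrary: f is linear, so the Jacobian is the same everywhere
J = jacobian(f, (0.5, -1.0, 2.0))
J_rounded = [[round(v, 6) for v in row] for row in J]
print(J_rounded)
```

Up to rounding, the rows should match the matrix written above, whichever point is chosen.
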
As for your other function, let's go through it. Here we have to tread carefully. After you compose $f$ with $g$, your new function $f^*=f \circ G$, where $G:\mathbb{R}^2 \longrightarrow \mathbb{R}^3$ is given by $G(x,y)=(x,y,g(x,y))$, is no longer a function from $\mathbb{R}^3$ to $\mathbb{R}^2$, but from $\mathbb{R}^2$ to $\mathbb{R}^2$. The answer to your question lies in the chain rule; what does it tell us? It tells us that the derivative of the composition of two functions is the composition of their derivatives. So the derivative of $f \circ G$ at $(x,y)$ is $df(G(x,y)) \circ dG(x,y)$. This is better calculated in terms of matrices. The Jacobian of $G$ is $$\begin{pmatrix} 1&0\\0&1\\ y&x \end{pmatrix}$$ Multiplying the Jacobian of $f$ by the Jacobian of $G$ you get $$\begin{pmatrix} 1 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}\begin{pmatrix} 1&0\\0&1\\ y&x \end{pmatrix}=\begin{pmatrix} 1& 1 \\ y &x \end{pmatrix}$$ That is to say, $$d(f \circ G)(x,y)[(h_1,h_2)]=(h_1+h_2,\,yh_1+xh_2)$$
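
The matrix product can be sanity-checked with a short sketch like the following, which multiplies the two Jacobians at a sample point and compares the result with a finite-difference Jacobian of the composite. The function names and the sample point $(2,3)$ are arbitrary choices for illustration.

```python
def jac_f():
    # Jacobian of f(x, y, z) = (x + y, z); constant because f is linear
    return [[1, 1, 0], [0, 0, 1]]

def jac_G(x, y):
    # Jacobian of G(x, y) = (x, y, x*y)
    return [[1, 0], [0, 1], [y, x]]

def matmul(A, B):
    # Plain matrix product of nested lists
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

def composite(x, y):
    # (f o G)(x, y) = f(x, y, x*y) = (x + y, x*y)
    return (x + y, x * y)

x, y = 2.0, 3.0
product = matmul(jac_f(), jac_G(x, y))

# Independent check: central finite differences on the composite itself
eps = 1e-6
fd = [[(composite(x + eps, y)[i] - composite(x - eps, y)[i]) / (2 * eps),
       (composite(x, y + eps)[i] - composite(x, y - eps)[i]) / (2 * eps)]
      for i in range(2)]
print(product, fd)
```

At this point the second row of the product is $(y, x) = (3, 2)$, and the finite-difference estimate should agree with it up to rounding.
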
If you instead mean the total derivative with respect to $x$, as in $\frac{d}{dx}$, that's a different story. The total derivative mentioned in the post you linked is the kind of derivative I just described. If you want to know about the $\frac{d}{dx}$ kind, let me know and I'll edit this answer, or ask a new question and tag me. But they are different things and should not be confused (in fact, the $\frac{d}{dx}$ "total derivative" is a special case of the derivative of a composite of functions).
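
To make the parenthetical remark concrete, here is a sketch under one extra assumption that is not in the question: that $y$ is itself a differentiable function of $x$, say $y=y(x)$. Then $x \mapsto f(x,y(x),g(x,y(x)))$ is the composite of $f\circ G$ with the curve $x \mapsto (x,y(x))$, whose derivative is the column $(1,y'(x))$, and the chain rule gives

$$\frac{d}{dx}\,f\big(x,y(x),g(x,y(x))\big)=\begin{pmatrix} 1 & 1 \\ y & x \end{pmatrix}\begin{pmatrix} 1 \\ y'(x) \end{pmatrix}=\big(1+y'(x),\; y(x)+x\,y'(x)\big)$$

If $y$ does not depend on $x$, then $y'=0$ and this reduces to the first column of the Jacobian, $(1,y)$, which is just the partial derivative with respect to $x$.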