Definition of the total derivative.


I am trying to understand the following definition.

Let $f:\mathbb{R}^n \rightarrow \mathbb{R}^m$. The total derivative of $f$ at the point $a$ is the unique linear map $Df|_a$ (if it exists) such that $$\lim_{h \rightarrow 0}\dfrac{f(a+h)-f(a)-Df|_a(h)}{\|h\|} = 0$$

Could someone explain why this definition works?

-Why should we divide by $||h||$?

-Why is $Df|_a$ linear?

-How should I interpret this linear map $Df|_a$? What is the meaning of the total derivative?


There are 4 best solutions below

4
On BEST ANSWER

This is one of the most fundamental definitions in all of analysis.

It says that the increment $\Delta f:=f(a+h)-f(a)$ of the function value should, to first approximation, be a linear function of the increment $h$ attached at the point $a$. In other terms: We want $$f(a+h)-f(a)=Lh +r(h)\qquad(|h|\ll1)\ ,\tag{1}$$ where the error $r(h)$ should be of smaller order than the linear term $Lh$ when $h$ is small. Now in general $|Lh|$ will be of order $|h|$ for "most" $h$. This means that we should require that $$\lim_{h\to0}{|r(h)|\over |h|}=0$$ in order to impart any real content to $(1)$. It turns out that this condition determines $L$ uniquely. If it can be satisfied, then $f$ is called differentiable at $a$, and one denotes the resulting $L$ by $Df\bigr|_a$, or similar.
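As a hedged illustration of $(1)$ (the function below is an example chosen here, not one from the question), take $f:\mathbb R^n\to\mathbb R$ with $f(x)=|x|^2=x\cdot x$. Expanding gives $$f(a+h)-f(a)=2\,a\cdot h+|h|^2\ ,$$ so $Lh=2\,a\cdot h$ is linear in $h$, and the error $r(h)=|h|^2$ satisfies $${|r(h)|\over|h|}=|h|\to0\qquad(h\to0)\ .$$ Hence $Df\bigr|_a h=2\,a\cdot h$.

A sketch of why $L$ is unique: if $L_1$ and $L_2$ both satisfy the condition, then subtracting the two versions of $(1)$ gives $|(L_1-L_2)h|/|h|\to0$. Putting $h=tv$ with a fixed $v\ne0$ and letting $t\to0^+$, the quotient is the constant $|(L_1-L_2)v|/|v|$, which must therefore be $0$, so $L_1=L_2$.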

6
On

The division by $\|h\|$ here is exactly analogous to the division by $h$ in the definition of the (standard) derivative of a real-valued function of a real variable:

$\dfrac{df}{dx}=\underset{h\rightarrow 0}{\lim}\dfrac{f(x+h)-f(x)}{h}$

To answer the second question, $Df|_a$ is linear because the definition requires it: we only look for a map satisfying the limit condition among the linear maps $h \mapsto Df|_a(h)$, since the goal is a linear approximation of the increment of $f$. Separately, the operation $f \mapsto Df|_a$ is itself linear, that is, it commutes with addition and scalar multiplication. This is a consequence of how it is defined and is not actually hard to prove (try using the definition to show $Df|_a+Dg|_a=D(f+g)|_a$ and $D(\alpha f)|_a=\alpha Df|_a$ directly; hint: use some linear algebra).

To answer the final question, it is the natural analog of the familiar derivative of a function $f:\mathbb{R} \rightarrow \mathbb{R}$ for the case of a function $f:\mathbb{R}^n \rightarrow \mathbb{R}^m$.

0
On

I think it's much easier to understand if you write the definition as "$Df_{|a}$ is the unique linear map $L$ (if it exists) satisfying $f(x) = f(a) + L(x-a) + o(||x-a||)$ when $x\rightarrow a$".

Maybe it's even clearer if you write "$Df_{|a}$ is the linear part of the unique affine map $A$ (if it exists) satisfying $f(x) = A(x) + o(||x-a||)$ when $x\rightarrow a$".

So you can see that $A$ is the best possible affine approximation of $f$ near $a$: the error you make by replacing $f$ with $A$ is negligible compared to $||x-a||$, whereas any other affine map leaves an error that is at least of order $||x-a||$ in some direction. $Df_{|a}$ is the linear part of this affine approximation.
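If it helps, the "best affine approximation" statement can be checked numerically. The sketch below is only an illustration under assumptions: the sample map `f`, the point `a`, the direction, and the perturbed map `bad` are all made up here, and the matrix in `jacobian_at` was computed by hand for this particular `f`. It prints the ratio $\|f(a+h)-f(a)-Lh\|/\|h\|$ for shrinking $h$, once with the candidate $L$ and once with a slightly different linear map.

```python
import math

def f(x, y):
    # Sample map f: R^2 -> R^2 (an assumption chosen for illustration)
    return (x * x * y, math.sin(x) + y)

def jacobian_at(x, y):
    # Hand-computed matrix of partial derivatives of the sample f at (x, y)
    return ((2 * x * y, x * x), (math.cos(x), 1.0))

def apply(matrix, h):
    # Apply a 2x2 matrix (tuple of rows) to the vector h
    return tuple(row[0] * h[0] + row[1] * h[1] for row in matrix)

def norm(v):
    # Euclidean norm on R^2
    return math.hypot(v[0], v[1])

def error_ratio(matrix, a, h):
    # ||f(a+h) - f(a) - L h|| / ||h|| for the linear map L given by `matrix`
    fa = f(*a)
    fah = f(a[0] + h[0], a[1] + h[1])
    lh = apply(matrix, h)
    r = (fah[0] - fa[0] - lh[0], fah[1] - fa[1] - lh[1])
    return norm(r) / norm(h)

a = (1.0, 2.0)
good = jacobian_at(*a)
# A different linear map: one entry of the matrix is perturbed by 0.5
bad = ((good[0][0] + 0.5, good[0][1]), good[1])

direction = (0.6, 0.8)  # unit vector along which h shrinks
for k in range(1, 6):
    t = 10.0 ** (-k)
    h = (t * direction[0], t * direction[1])
    print(t, error_ratio(good, a, h), error_ratio(bad, a, h))
```

With the hand-computed matrix the ratio should shrink roughly in proportion to $\|h\|$, while for the perturbed map it should level off at a nonzero value, which is the sense in which only one linear map can satisfy the definition.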

0
On

Just to add a bit to the previous answers, you can check that this definition meets the "usual" requirement for differentiability if $f:\mathbb R\to \mathbb R.$

Fix a point $a$. According to the "new" definition, to find the derivative of $f$ at $a$, we seek a linear map

$L(a)\in \mathcal L(\mathbb R,\mathbb R)$ such that

$\lim _{h\to 0}\frac{f(a+h)-f(a)-L(a)h}{h}=0$

(dividing by $h$ instead of $|h|$ changes only the sign of the quotient, which does not affect whether the limit is $0$).

Now, the linear transformations $\mathbb R\to \mathbb R$ are of the form $h\mapsto bh$ for some $b\in \mathbb R$, so writing $L(a)h=bh$ we have

$\lim _{h\to 0}\frac{f(a+h)-f(a)-bh}{h}=0$, which simplifies to

$\lim _{h\to 0}\left ( \frac{f(a+h)-f(a)}{h}-b \right )=0$ so that $b=f'(a)$, that is, $L(a)h=f'(a)h$
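For instance (an example chosen here for illustration), with $f(x)=x^2$ the quotient is $$\frac{(a+h)^2-a^2-bh}{h}=2a-b+h\ \longrightarrow\ 2a-b\qquad(h\to 0),$$ which is $0$ exactly when $b=2a=f'(a)$.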