Linearity of the directional derivative


I'm confused by the highlighted remark in the text quoted below. Isn't $df_x$ linear by its very definition? At least Rudin and Apostol define the derivative to be a linear map.

Actually, while writing this question I may have come up with an answer, but since I'm not sure whether I'm right, let me state my version here and see whether it is correct.

I think what is meant in the text below is that their definition of the derivative at $x$ evaluated at $h$ does not assume linearity. The definitions of Rudin and Apostol, by contrast, assume linearity, and then they (at least Apostol) prove (see Theorem 12.3 in the question "Expressing directional derivative in terms of partial derivatives") that the derivative of $f$ at $x$ evaluated at $h$, $df_x(h)$, actually equals the directional derivative of $f$ at $x$ in the direction of $h$. Am I interpreting what is going on correctly?

Also, if we assume the definition given in the text below, how does one prove that $df_x$ is linear? I don't see how it follows from manipulating the quotient inside the $\lim$ sign. (I don't have Spivak at hand.)

enter image description here


There is 1 best solution below


You can show that it is linear if the function is at least $C^1$.

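For homogeneity, take $a\neq 0$ (the case $a=0$ is trivial, since both sides are $0$). You have $$df_x(ah)=\lim_{t\rightarrow 0}{{f(x+tah)-f(x)}\over t}=a \lim_{t\rightarrow 0}{{f(x+tah)-f(x)}\over{at}}.$$ Write $u=at$; since $u\rightarrow 0$ exactly when $t\rightarrow 0$, you have $$a\lim_{t\rightarrow 0}{{f(x+tah)-f(x)}\over{at}}=a \lim_{u\rightarrow 0}{{f(x+uh)-f(x)}\over u}=a\,df_x(h).$$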

$$df_x(h_1+h_2)=\lim_{t\rightarrow 0}{{f(x+t(h_1+h_2))-f(x)}\over t}=\lim_{t\rightarrow 0}{{f(x+t(h_1+h_2))-f(x+th_1)}\over t}+ \lim_{t\rightarrow 0}{{f(x+th_1)-f(x)}\over t},$$ if the latter limits exist.

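Suppose that the function is $C^1$. Apply the mean value theorem to $s\mapsto f(x+th_1+sh_2)$ between $s=0$ and $s=t$ (for a real-valued $f$; otherwise argue componentwise): there is $\theta\in(0,1)$ with $f(x+th_1+th_2)-f(x+th_1)=t\,df_{x+th_1+\theta th_2}(h_2)$. By continuity of $x\mapsto df_x(h_2)$, this gives $f(x+th_1+th_2)=f(x+th_1)+t\,df_{x+th_1}(h_2)+o(t)$, and you deduce that $f(x+t(h_1+h_2))-f(x+th_1)=t\,df_{x+th_1}(h_2)+o(t)$. This implies that $\lim_{t\rightarrow 0}{{f(x+t(h_1+h_2))-f(x+th_1)}\over t}=\lim_{t\rightarrow 0}df_{x+th_1}(h_2)=df_x(h_2)$.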

Using the expression above, you deduce that $df_x(h_1+h_2)=df_x(h_1)+df_x(h_2)$.
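
To see why some regularity assumption is needed for additivity, here is a standard example (my own choice of illustration, not taken from the quoted text). Take $f:\mathbb{R}^2\to\mathbb{R}$ with $f(0,0)=0$ and $$f(x,y)={{x^2y}\over{x^2+y^2}}\quad\text{for }(x,y)\neq(0,0).$$ For $h=(a,b)\neq(0,0)$, the limit definition at the origin gives $$df_0(h)=\lim_{t\rightarrow 0}{{f(ta,tb)-f(0,0)}\over t}=\lim_{t\rightarrow 0}{{t^3a^2b}\over{t\cdot t^2(a^2+b^2)}}={{a^2b}\over{a^2+b^2}}.$$ This is homogeneous in $h$, in agreement with the first computation, but it is not additive: with $h_1=(1,0)$ and $h_2=(0,1)$ you get $df_0(h_1)=0$ and $df_0(h_2)=0$, while $df_0(h_1+h_2)=df_0((1,1))={1\over 2}$. So all directional derivatives can exist at a point without $h\mapsto df_x(h)$ being linear, and the $C^1$ hypothesis used above cannot simply be dropped.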