Definition of derivative as a linear operator being applied to a vector


I have been told that, given a differentiable function $f: \mathbb{R}^n\longrightarrow \mathbb{R}$, we can view $f'(x)$ as a linear operator from $\mathbb{R}^n$ to $\mathbb{R}$ for any $x$, which makes sense because it is a vector, and thus a linear operator. So $f'(x)[y] = \nabla f(x)^Ty$, basically by definition. But later in class, we used $f'(x)[y] = \lim_{h\to 0^+}\frac{f(x + hy) - f(x)}{h}$, which I don't understand. Specifically, why is $y$ showing up inside the limit? To me, $f'(x)[y]$ means first take the derivative of $f$ at $x$, and then apply the result to $y$. So $y$ shouldn't appear in the limit definition of the derivative of $f$ at $x$, and yet here it just looks like the $y$ and the limit have been fused together.

It seems like the two values are supposed to be equal, so should I just treat the above equation as the definition of $f'(x)[y]$? If so, is there an easy way to see that the two notions of $f'(x)[y]$ agree, i.e. that $\nabla f(x)^Ty = \lim_{h\to 0^+}\frac{f(x + hy) - f(x)}{h}$, where $\nabla f(x) = \lim_{h\to 0^+}\frac{f(x + h) - f(x)}{h}$?

By the way, I don't think it really matters whether the limit is one-sided or two-sided; it was just posed to me as a one-sided limit.



5
On BEST ANSWER

Suppose that you define $f'(x)(y)$ as $\nabla f(x)^Ty$. Since$$\nabla f(x)=\left(\frac{\partial f}{\partial x_1}(x),\ldots,\frac{\partial f}{\partial x_n}(x)\right),$$if $\{e_1,\ldots,e_n\}$ is the standard basis of $\Bbb R^n$, then, for each $k\in\{1,2,\ldots,n\}$, we have $f'(x)(e_k)=\nabla f(x)^Te_k=\frac{\partial f}{\partial x_k}(x)$. Now, if $y\in\Bbb R^n$, then $y$ can be written as $a_1e_1+a_2e_2+\cdots+a_ne_n$ and therefore, by linearity,$$f'(x)(y)=\nabla f(x)^Ty=a_1\frac{\partial f}{\partial x_1}(x)+\cdots+a_n\frac{\partial f}{\partial x_n}(x).\tag1$$On the other hand, by the definition of partial derivative,$$\lim_{h\to0}\frac{f(x+he_k)-f(x)}h=\frac{\partial f}{\partial x_k}(x).$$Since $f$ is differentiable at $x$, the limit $\lim_{h\to0}\frac{f(x+hy)-f(x)}h$ exists for every $y\in\Bbb R^n$ and depends linearly on $y$ (this is the step that uses differentiability; without it, linearity in $y$ can fail). So, again by linearity,$$\lim_{h\to0}\frac{f(x+hy)-f(x)}h=(1)=f'(x)(y).$$
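A quick numerical sanity check of this identity may help. The sketch below is only an illustration: the test function `f`, its hand-computed gradient `grad_f`, and the points `x` and `y` are assumptions chosen for the example, not anything from the question. It compares the one-sided difference quotient $\frac{f(x+hy)-f(x)}h$ with $\nabla f(x)^Ty$ for shrinking $h$.

```python
import math

def f(v):
    # Hypothetical smooth test function on R^2 (an assumption for this sketch).
    x1, x2 = v
    return x1 ** 2 + 3.0 * x1 * x2 + math.sin(x2)

def grad_f(v):
    # Gradient of the test function above, computed by hand.
    x1, x2 = v
    return (2.0 * x1 + 3.0 * x2, 3.0 * x1 + math.cos(x2))

def directional_quotient(func, x, y, h):
    # Difference quotient (func(x + h*y) - func(x)) / h along the direction y.
    shifted = tuple(xi + h * yi for xi, yi in zip(x, y))
    return (func(shifted) - func(x)) / h

x = (1.0, 2.0)
y = (0.5, -1.5)

# grad f(x)^T y, i.e. the linear map f'(x) applied to y.
exact = sum(g * yi for g, yi in zip(grad_f(x), y))

for h in (1e-2, 1e-4, 1e-6):
    print(h, directional_quotient(f, x, y, h), exact)
```

As $h$ shrinks, the quotient should approach the value of $\nabla f(x)^Ty$, which is what the argument above predicts for a differentiable $f$.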

1
On

A scalar function $f: \>{\mathbb R}^n\to{\mathbb R}$ is differentiable at the point $p\in{\mathbb R}^n$ if there exists a linear map $A:\>T_p\to{\mathbb R}$ such that $$f(p+y)-f(p)=A.y+o(|y|)\qquad(y\to0)\ .\tag{1}$$ This map $A$ is called the derivative of $f$ at $p$, and is denoted by $f'(p)$, $\>df(p)$, or similar. In standard coordinates $f'(p)$ is represented by the vector $\nabla f(p)$, so that $$f(p+y)-f(p)=\nabla f(p)\cdot y+o(|y|)\qquad(y\to0)\ .$$ If $f$ is differentiable at $p$, and a vector $y\in T_p$ is given, you can look at the directional derivative $$D_y f(p):=\lim_{t\to0+}{f(p+t y)-f(p)\over t}\ .\tag{2}$$ As $${f(p+t y)-f(p)\over t}={\nabla f(p)\cdot(ty)+o(|t y|)\over t}\qquad(t\to0+)\ ,$$ and $o(|ty|)/t\to0$ for fixed $y$, we obtain $$D_y f(p)=\nabla f(p)\cdot y=df(p).y\ .$$ But sometimes the directional derivatives $(2)$ are defined even when the function $f$ is not differentiable at $p$. As an example take the function $f(x):=|x|$ and $p=0$. This $f$ is not differentiable at $0$, but for any $y\in T_0$ we have $$D_y f(0)=|y|\ .$$
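The example $f(x)=|x|$ can also be checked numerically. The sketch below is an illustration under assumptions: it takes $|\cdot|$ to be the Euclidean norm on ${\mathbb R}^2$, and the direction `y` and step `t` are arbitrary choices. It evaluates the one-sided quotient at $p=0$ in the directions $y$ and $-y$.

```python
import math

def norm(v):
    # Euclidean norm, playing the role of f(x) = |x|.
    return math.sqrt(sum(c * c for c in v))

def one_sided_quotient(func, p, y, t):
    # Quotient (func(p + t*y) - func(p)) / t for a small t > 0.
    shifted = tuple(pi + t * yi for pi, yi in zip(p, y))
    return (func(shifted) - func(p)) / t

p = (0.0, 0.0)
y = (3.0, 4.0)          # illustrative direction with |y| = 5
neg_y = (-3.0, -4.0)
t = 1e-8

d_plus = one_sided_quotient(norm, p, y, t)
d_minus = one_sided_quotient(norm, p, neg_y, t)
print(d_plus, d_minus)
```

Both quotients come out close to $|y|=5$. A linear map would have to satisfy $D_{-y}f(0)=-D_yf(0)$, so $y\mapsto D_yf(0)$ is not linear here, which is consistent with $f$ not being differentiable at $0$ even though every one-sided directional derivative exists.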