What is the meaning of directional derivative for a linear function $f(x) = ax$?

So the directional derivative is defined as,

$$f_d(x) = \lim_{ s \to 0^+} \frac{f(x+sd) - f(x)}{s}$$

If $f(x) = ax$, then $f_d(x) = a\cdot d, d \in \mathbb{R}$
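For reference, this follows by substituting $f(x)=ax$ directly into the definition above (a short worked step that uses nothing beyond that definition):

$$f_d(x) = \lim_{ s \to 0^+} \frac{a(x+sd) - ax}{s} = \lim_{ s \to 0^+} \frac{a\,s\,d}{s} = a\cdot d$$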

But why does this make sense? What is the interpretation of $f_d(x)$?

My rate of change/slope is a constant $a$ no matter where I am on the $x$-axis.

So why is it that increasing $d$ will increase my directional derivative, when the slope is constant throughout?

There are 1 best solutions below

To understand directional derivatives, let us compare them with the Fréchet derivative, the notion commonly used in real analysis and calculus.

Let $f$ be a real-valued function defined on an inner-product space $V$. The Fréchet derivative at $x\in V$, when it exists, is represented by a vector $\nabla f(x) \in V$ such that the directional derivative in every direction $d\in V$ can be written as:

$$f_d(x) = \lim_{ s \to 0^+} \frac{f(x+sd) - f(x)}{s}= \langle \nabla f(x),d \rangle \quad \forall\, d \in V$$

From this, for the case $V=\mathbb R$, you can see that

  1. Even for a univariate function, the directional derivative may exist in every direction at a point (some authors call this Gateaux differentiability) while the Fréchet derivative does not. For example, $f(x)=|x|$ at $x=0$ has $f_d(0)=|d|$ for every $d$, which is not linear in $d$, so no single number $\nabla f(0)$ satisfies $f_d(0)=\nabla f(0)\times d$; hence $f$ is not Fréchet differentiable (simply called differentiable in real analysis) at $0$. By contrast, for (Fréchet) differentiable functions, at any $x\in \mathbb R$ we have: $$f_d(x)=f'(x)\times d$$

with $d=y-x$ for any $y \in \mathbb R$. In particular, for a linear function $f(x)=ax$, $f_d(x)=a\times d$.

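  2. The directional derivative is the first-order amount by which $f$ changes when moving from $x$ along the given vector $d$, so it scales with the length of $d$: doubling $d$ doubles $f_d(x)$ even though the slope is unchanged. The Fréchet derivative (if it exists) is the asymptotic rate of change per one unit of increment at $x \in \mathbb R$; i.e., it is the directional derivative for $d=(x+1)-x=1$.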
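The two points above can be checked numerically. The following is a minimal sketch, assuming a small fixed step `s` stands in for the limit $s \to 0^+$; the helper name `directional_derivative` and the sample values of `a`, `x` and `d` are chosen only for illustration.

```python
def directional_derivative(f, x, d, s=1e-6):
    """One-sided difference quotient approximating
    lim_{s -> 0+} (f(x + s*d) - f(x)) / s."""
    return (f(x + s * d) - f(x)) / s


a = 3.0  # illustrative slope


def linear(x):
    # The linear function f(x) = a*x from the question.
    return a * x


# For f(x) = a*x the quotient is approximately a*d, whatever x is:
# it grows with d although the slope a stays fixed.
for d in (1.0, 2.0, -0.5):
    print("linear, d =", d, "->", directional_derivative(linear, 5.0, d))

# For f(x) = |x| at x = 0 the one-sided quotient is approximately |d|,
# which is not linear in d (d = 1 and d = -1 give the same value).
for d in (1.0, -1.0, 2.0):
    print("abs,    d =", d, "->", directional_derivative(abs, 0.0, d))
```

With $d=1$ the linear case returns (approximately) the ordinary slope $a$, which is the "one unit of increment" reading in the second point.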

For $V=\mathbb {R}^n$, $\nabla f(x)$ is usually called the gradient vector at $x$, given by

$$\nabla f(x)=\left(\frac{\partial f(x)}{\partial x_1}, \dots, \frac{\partial f(x)}{\partial x_n} \right)^T$$

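For $V=\mathbb {R}^n$, the "one unit of increment" interpretation does not carry over to every $d$ with $\|d\|=1$, because the norm induced by the standard inner product $ \langle x,y \rangle =x^{T}y$ is the Euclidean one, so a unit-norm direction generally does not increase any single coordinate by one unit. However, for $d=(x+e_i)-x=e_i$ we again have: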

$$f_{d=e_i}(x)=\frac{\partial f(x)}{\partial x_i}\times 1$$

where $e_i$ is the vector whose $i$th element is 1 and whose other elements are 0.
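As a rough numerical illustration of the $\mathbb{R}^n$ case, the sketch below compares the one-sided difference quotient with $\langle \nabla f(x), d \rangle$. The test function $f(x_1,x_2)=x_1^2+3x_2$, the point `x` and the direction `d` are assumptions made up for this example, and a small fixed `s` again stands in for the limit.

```python
def f(x):
    # Hypothetical example function: f(x1, x2) = x1**2 + 3*x2.
    return x[0] ** 2 + 3.0 * x[1]


def grad_f(x):
    # Analytic gradient of the example function: (2*x1, 3).
    return [2.0 * x[0], 3.0]


def inner(u, v):
    # Standard inner product <u, v> = u^T v.
    return sum(ui * vi for ui, vi in zip(u, v))


def directional_derivative(f, x, d, s=1e-6):
    # One-sided difference quotient (f(x + s*d) - f(x)) / s.
    shifted = [xi + s * di for xi, di in zip(x, d)]
    return (f(shifted) - f(x)) / s


x = [1.0, 2.0]
e1 = [1.0, 0.0]  # first standard basis vector
d = [3.0, 4.0]   # a direction that is not unit length

# Along e1 the quotient approximates the partial derivative w.r.t. x1.
print(directional_derivative(f, x, e1), grad_f(x)[0])

# Along a general d it approximates <grad f(x), d>.
print(directional_derivative(f, x, d), inner(grad_f(x), d))
```

In each printed pair the two numbers should agree up to the discretization error of the finite step.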

Hope this is helpful!