The directional derivative equals the dot product of the gradient and a unit vector. But what if the function is not totally differentiable?


The directional derivative is the dot product of the gradient and a unit vector. But what if the function is not totally differentiable? Is it an implicit assumption that the formula only applies to totally differentiable functions?

Apostol, Volume 2, does not explicitly spell this out. I am convinced that the formula only holds when the function is totally differentiable; I just want some confirmation in this regard. Furthermore, in many problems where the directional derivative is asked to be computed, the author simply invokes the above formula without proving total differentiability.

So this raises the question: does the formula make any sense if the function is not totally differentiable?

In other words, is the gradient a concept that 'exists' on its own, or is it defined 'through' total differentiability, and does it therefore implicitly subsume the prerequisite of total differentiability?

Best Answer

I think most of what needs to be said is in the comments, but in an attempt to set things out clearly:

If $f\colon \Omega\to \mathbb R$ is a function defined on an open set $\Omega$ of $\mathbb R^n$ (with the normal Euclidean notion of distance say) then for $a \in \Omega$ and $v \in \mathbb R^n$, the directional derivative of $f$ at $a$ in the direction $v$ is $$ \lim_{t \to 0} \frac{f(a+t.v)-f(a)}{t} $$ when this limit exists. It is denoted in various ways, I'll use $\partial_vf(a)$. You can check that if $r \in \mathbb R$ then $\partial_{r.v} f(a) = r.\partial_v f(a)$, that is $\partial_v f$ is compatible with scalar multiplication (so it suffices to compute directional derivatives for vectors of norm $1$ for example).
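The limit defining $\partial_v f(a)$, and its compatibility with scalar multiplication, can be checked numerically. A minimal Python sketch (the function `f`, the point `a`, and the helper `directional_derivative` are illustrative choices, not from the answer):

```python
import numpy as np

def directional_derivative(f, a, v, t=1e-6):
    """Symmetric-difference approximation of lim_{t->0} (f(a+t v) - f(a)) / t."""
    a, v = np.asarray(a, float), np.asarray(v, float)
    return (f(a + t * v) - f(a - t * v)) / (2 * t)

# Example: f(x, y) = x^2 + 3 x y at a = (1, 2); grad f(a) = (8, 3).
f = lambda p: p[0] ** 2 + 3 * p[0] * p[1]
a = np.array([1.0, 2.0])
v = np.array([1.0, 1.0])

# Compatibility with scalar multiplication: d_{r v} f(a) = r * d_v f(a).
r = 2.5
print(directional_derivative(f, a, v))        # ~ 11.0
print(directional_derivative(f, a, r * v))    # ~ 27.5 = r * 11.0
```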

In deciding what it means for $f$ to be differentiable at $a$, there are (at least) the following possible options:

  1. Require only that $\partial_vf(a)$ exists for all $v \in \mathbb R^n$ -- or equivalently $\partial_v f(a)$ exists for all $v \in S^{n-1} = \{v \in \mathbb R^n: \|v\|=1\}$.
  2. In addition to 1., require that there is a linear map $T\colon \mathbb R^n \to \mathbb R$ such that $\partial_v f(a) = T(v)$ for all $v$.
  3. In addition to 1. and 2. require that $\frac{f(a+tv)-f(a)-T(t.v)}{t}\to 0$ uniformly in $v \in S^{n-1}$, that is, require $ \lim_{h \to 0}\frac{\|f(a+h)-f(a)-T(h)\|}{\|h\|}=0$.

It is easy to see that, if $f$ satisfies 3., then $\partial_vf(a) = T(v)$, and as $T$ is usually denoted $Df_a$, this becomes $\partial_v f(a) = Df_a(v)$. Using the standard basis to associate a matrix to the linear map $Df_a$, this becomes the dot product formula in the OP.
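For a function that does satisfy 3., the dot-product formula can be verified numerically; a short sketch with an illustrative smooth function (everywhere Fréchet differentiable, so the formula applies):

```python
import numpy as np

# f(x, y) = sin(x) * exp(y) is smooth, hence Frechet differentiable,
# so d_v f(a) = grad f(a) . v holds.
f = lambda p: np.sin(p[0]) * np.exp(p[1])
grad_f = lambda p: np.array([np.cos(p[0]) * np.exp(p[1]),
                             np.sin(p[0]) * np.exp(p[1])])

a = np.array([0.7, -0.3])
v = np.array([3.0, 4.0]) / 5.0   # a unit vector

t = 1e-6
numeric = (f(a + t * v) - f(a - t * v)) / (2 * t)   # difference quotient
exact = grad_f(a) @ v                                # dot-product formula
print(numeric, exact)   # the two values agree closely
```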

If condition 1. holds then $\partial_vf(a)$ is known as the Gâteaux derivative (or Gâteaux differential) of $f$ at $a$. Oddly, as far as I know, condition 2. does not seem to have a name, while the derivative defined by condition 3., the one most widely taught, is called the Fréchet derivative.

Fréchet differentiability has a number of things going for it: for example, if $f$ has a Fréchet derivative at $a$ then $f$ is continuous at $a$, whereas this is false for functions satisfying only conditions 1. and 2.

Examples:

i) A function which satisfies 1. and not 2. at $a=(0,0)\in\mathbb R^2$ is $$ f_1(x,y) = \left\{ \begin{array}{cl} \frac{x^2y}{x^2+y^2}, & (x,y)\neq (0,0)\\ 0, & (x,y)=(0,0)\end{array}\right. $$ Indeed, since $f_1$ is homogeneous of degree $1$, $\partial_vf_1(0)= f_1(v)$, and $v \mapsto f_1(v)$ is not linear.
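The failure of linearity in example i) can be seen numerically: the directional derivatives at the origin are not additive in $v$. A sketch (helper names are illustrative):

```python
import numpy as np

def f1(p):
    """f1(x, y) = x^2 y / (x^2 + y^2), with f1(0, 0) = 0."""
    x, y = p
    return 0.0 if (x == 0 and y == 0) else x * x * y / (x * x + y * y)

def d_v(f, v, t=1e-7):
    # One-sided difference quotient at a = 0: (f(t v) - f(0)) / t.
    v = np.asarray(v, float)
    return (f(t * v) - f(np.zeros(2))) / t

e1, e2 = np.array([1.0, 0.0]), np.array([0.0, 1.0])
w = e1 + e2
# d_v f1(0) = f1(v), which is NOT linear in v:
print(d_v(f1, e1) + d_v(f1, e2))   # f1(e1) + f1(e2) = 0.0
print(d_v(f1, w))                  # f1((1,1)) = 0.5, not 0.0
```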

ii) A function which satisfies 2. and not 3. is given in the comments above. If we write $1_A$ for the indicator function of $A$, that is $1_A(x)=1$ if $x\in A$ and $1_A(x)=0$ otherwise, then we get a similar example by considering $U = \{(x,y):0<y<x^2<1\}$: its indicator function $1_U$ satisfies 2. but not 3. at $a=(0,0)$. Indeed, every line through the origin lies outside $U$ near $(0,0)$, so $\partial_v 1_U(0)=0$ for all $v$ and $T=0$ works for 2.; but $1_U$ is of course not continuous at $(0,0)$, since $(0,0) \in \overline{U}$ while $1_U(0,0)=0$.
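Example ii) can also be probed numerically: along lines through the origin the difference quotients vanish, yet points of $U$ approach the origin along the parabola-like path $(x, x^2/2)$. A sketch (the sample directions and step sizes are illustrative):

```python
import numpy as np

def ind_U(p):
    """Indicator of U = {(x, y) : 0 < y < x^2 < 1}."""
    x, y = p
    return 1.0 if (0 < y < x * x < 1) else 0.0

# Along each sample direction v, t v lies outside U for small t,
# so the difference quotient (1_U(t v) - 1_U(0)) / t is 0: condition 2
# holds with T = 0.
t = 1e-4
for v in [(1, 0), (0, 1), (1, 1), (1, -0.5), (-1, 0.3)]:
    q = (ind_U(t * np.asarray(v, float)) - ind_U(np.zeros(2))) / t
    print(v, q)   # 0.0 in every case

# Yet 1_U is not continuous at 0: points (x, x^2/2) -> (0,0) lie inside U.
x = 1e-4
print(ind_U(np.array([x, x * x / 2])))   # 1.0
```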