A question concerning the directional derivative of a function


As I understand it, intuitively the directional derivative of a function $f$, at some point, describes its rate of change as one moves along a direction, specified by some vector $\mathbf{v}$, away from that point.

With this in mind I am trying to understand the mathematical formalism more deeply, but I'm unsure whether I am doing so correctly. This is my understanding so far (apologies for any abuse of notation):

Let $f:\mathbb{R}^{n}\rightarrow\mathbb{R}$ be some smooth function, and let $c:\mathbb{R}\rightarrow\mathbb{R}^{n}$ be some smooth curve in $\mathbb{R}^{n}$, with coordinates $(x\circ c)(t)\equiv x(t)=(x^{1}(t),\ldots ,x^{n}(t))$.

In a sufficiently small neighbourhood of a point $x(0)=a$ we can describe the curve linearly as $$x(t)=a+t\mathbf{v}$$ where $\mathbf{v}$ is the tangent vector to the curve $x(t)$ at the point $x(0)=a=(a^{1},\ldots ,a^{n})$.

Given this, we can determine the rate of change of $f$ along the direction defined by $\mathbf{v}$ by evaluating $f$ along the line segment $x(t)=a+t\mathbf{v}$, which gives the composite function $(f\circ x)(t)=f(x(t))$ of the single variable $t$. Taking its derivative with respect to $t$ at $t=0$ then yields the rate of change of $f$ along $\mathbf{v}$. Indeed, $$\frac{df(x(t))}{dt}\bigg\vert_{t=0}=\sum_{i=1}^{n}\frac{\partial f(x(t))}{\partial x^{i}}\frac{d x^{i}}{dt}\bigg\vert_{t=0}=\nabla f(a)\cdot\mathbf{v}$$ where $\nabla f=\left(\frac{\partial f(x(t))}{\partial x^{1}},\ldots,\frac{\partial f(x(t))}{\partial x^{n}}\right)$ is the gradient of $f$, and $v^{i}=\frac{d x^{i}}{dt}$ are the components of the tangent vector at the point $x(0)=a$.
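As a quick sanity check of the formula above, here is a minimal numerical sketch. The function `f`, the point `a`, and the vector `v` are arbitrary choices made for illustration only, and the helper name `directional_derivative_fd` is hypothetical. It compares $\nabla f(a)\cdot\mathbf{v}$ with a central-difference estimate of $\frac{d}{dt}f(a+t\mathbf{v})$ at $t=0$.

```python
import math

def f(x, y):
    # Illustrative smooth function (an assumption, not from the question):
    # f(x, y) = x^2 * y + sin(y)
    return x**2 * y + math.sin(y)

def grad_f(x, y):
    # Analytic gradient of the f above: (2xy, x^2 + cos(y))
    return (2 * x * y, x**2 + math.cos(y))

def directional_derivative_fd(func, a, v, h=1e-6):
    # Central-difference estimate of d/dt func(a + t v) at t = 0
    fwd = func(a[0] + h * v[0], a[1] + h * v[1])
    bwd = func(a[0] - h * v[0], a[1] - h * v[1])
    return (fwd - bwd) / (2 * h)

a = (1.0, 2.0)    # base point (arbitrary)
v = (3.0, -1.0)   # direction vector (arbitrary, not normalised)

g = grad_f(*a)
grad_dot_v = g[0] * v[0] + g[1] * v[1]       # gradient evaluated at a, dotted with v
numeric = directional_derivative_fd(f, a, v)  # derivative of t -> f(a + t v) at t = 0

print(grad_dot_v, numeric)  # the two values should agree to several digits
```

Note that the gradient is evaluated at the fixed point `a`, and that `v` is not normalised, so the result scales linearly with the length of `v`.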

Would this be a correct understanding at all?

Best answer

You haven't really described the arbitrary curve $c(t)$; instead you replaced it by a linear one. This is legitimate, but the original curve then plays no role in what you said. Notice that the chain rule formula, as you indicated, involves only the derivatives of the components of $x$ at $t=0$ (you should change $x=a$ to $t=0$). Therefore the choice of curve is immaterial, as long as it passes through $a$ with velocity $\mathbf{v}$ at $t=0$, and in fact one can define a tangent vector at a point $p$ as an equivalence class of smooth curves through $p$, two curves being equivalent when they have the same velocity at $p$.

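Meanwhile, the partials of $f$ should be evaluated at $x=a$, that is, $\nabla f(a)=\left(\frac{\partial f}{\partial x^{1}}(a),\ldots,\frac{\partial f}{\partial x^{n}}(a)\right)$.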
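To illustrate why the choice of curve is immaterial, here is a small numerical sketch. The function `f` and both curves are made up for the example. It differentiates $f$ along a straight line and along a curved path that share the same point and the same velocity at $t=0$; under those assumptions the two derivatives should coincide.

```python
import math

def f(x, y):
    # Illustrative smooth function (an assumption): f(x, y) = x^2 * y + sin(y)
    return x**2 * y + math.sin(y)

a = (1.0, 2.0)    # common base point
v = (3.0, -1.0)   # common velocity at t = 0

def line(t):
    # Straight line through a with velocity v
    return (a[0] + t * v[0], a[1] + t * v[1])

def curved(t):
    # A different curve with the same position and velocity at t = 0:
    # d/dt sin(3t) = 3 and d/dt (-t + 5 t^2) = -1 at t = 0
    return (a[0] + math.sin(v[0] * t), a[1] + v[1] * t + 5.0 * t**2)

def ddt_at_zero(curve, h=1e-5):
    # Central-difference estimate of d/dt f(curve(t)) at t = 0
    return (f(*curve(h)) - f(*curve(-h))) / (2 * h)

d_line = ddt_at_zero(line)
d_curved = ddt_at_zero(curved)

print(d_line, d_curved)  # both should be close to grad f(a) . v
```

The higher-order terms in `curved` change the path away from $t=0$ but do not affect the first derivative there, which is the sense in which only the equivalence class of the curve matters.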