Directional derivative matrix representation


Consider the definition of the directional derivative of a $C^\infty$ differentiable function $f:\mathbb R^n\to\mathbb R^m$ at the point $x\in\mathbb R^n$ and in the direction $u\in\mathbb R^n$ ($\|u\|_2=1$):

$$ Df(x)(u) := \lim_{h\to 0}\frac{f(x+hu)-f(x)}{h}. $$
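This limit can be checked numerically with a difference quotient. The following is an illustrative sketch (the function `f` and the point are arbitrary examples, not from the question), approximating $Df(x)(u)$ componentwise for a map $\mathbb R^2\to\mathbb R^2$:

```python
# Hypothetical example: approximate Df(x)(u) by a central difference quotient
# for f(x0, x1) = (x0^2 + x1, x0*x1), a smooth map R^2 -> R^2.

def f(x):
    return [x[0]**2 + x[1], x[0]*x[1]]

def directional_derivative(f, x, u, h=1e-6):
    """Central-difference approximation of Df(x)(u), one component at a time."""
    fp = f([xi + h*ui for xi, ui in zip(x, u)])
    fm = f([xi - h*ui for xi, ui in zip(x, u)])
    return [(a - b) / (2*h) for a, b in zip(fp, fm)]

x, u = [1.0, 2.0], [1.0, 0.0]   # u = e_1, so Df(x)(u) is the first Jacobian column
approx = directional_derivative(f, x, u)
# The Jacobian at x = (1, 2) is [[2, 1], [2, 1]], so Df(x)(e_1) = (2, 2).
```

Taking $u = e_j$ this way recovers the $j$-th column of the Jacobian, which is the point of the first question.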

$Df(x):\mathbb R^n\to\mathbb R^m$ is a linear map, thus it admits a matrix representation. First question: what is the matrix representation of $Df(x)$ with respect to the standard bases of $\mathbb R^n$ and $\mathbb R^m$ (i.e. $\{e_i\}_{i=1}^n$ and $\{e_i\}_{i=1}^m$ respectively)?

Next, consider the second-order directional derivative $D^2f(x):\mathbb R^n\times\mathbb R^n\to\mathbb R^m$, i.e. the "directional derivative of the directional derivative". It is a bilinear map, so it too admits a matrix-like representation. Second question: what is the representation of $D^2f(x)$, again with respect to the standard bases?

Note: I am using the word "matrix" loosely. My current thinking is that $D^2f(x)$ is represented by a tensor (like a matrix "box" with a depth dimension as well as rows and columns).

Very importantly, I would like your matrix representations to collapse nicely to the gradient vector for $Df(x)$ and to the Hessian matrix for $D^2f(x)$ in the special case $m=1$.

Background: this is not homework. I ask this question because we learned about these directional derivatives in class, but I did not manage to understand the above questions. I also could not find a good book which goes through this.

Bonus: a (tensor?) representation for the $k$-th order directional derivative $D^kf(x)$ would be amazing.

Answer:

When working with directional derivatives, it is often easier to look at the one-variable function $\phi(t) = f(x+tu)$.

Then $\phi'(t) = \sum_j u_j {\partial f(x+tu) \over \partial x_j}$ and $\phi''(t) = \sum_i \sum_j u_i u_j {\partial^2 f(x+tu) \over \partial x_i \partial x_j}$, and so $\phi'(0) = \sum_j u_j {\partial f(x) \over \partial x_j}$, $\phi''(0) = \sum_i \sum_j u_i u_j {\partial^2 f(x) \over \partial x_i \partial x_j}$.

In particular, $\phi'(0) = Df(x)(u)$ is linear in $u$, and its matrix is the $m\times n$ Jacobian $\left[{\partial f_i \over \partial x_j}\right]$, which for $m=1$ reduces to the gradient (as a row vector). Likewise, the coefficients of $\phi''(0)$ form, for each component of $f$, the $n\times n$ matrix of second partials, which for $m=1$ is exactly the Hessian.
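These two formulas can be verified numerically in the case $m=1$, where they read $\phi'(0) = \nabla f(x)\cdot u$ and $\phi''(0) = u^\top H(x)\, u$. A minimal sketch, with an assumed example function $f(x_0,x_1) = x_0^2 x_1 + x_1^2$ and hand-computed gradient and Hessian:

```python
# Numerical check (m = 1): phi'(0) = grad f(x) . u and phi''(0) = u^T H(x) u
# for the example f(x0, x1) = x0^2*x1 + x1^2.

def f(x):
    return x[0]**2 * x[1] + x[1]**2

def grad_f(x):                      # hand-computed gradient
    return [2*x[0]*x[1], x[0]**2 + 2*x[1]]

def hess_f(x):                      # hand-computed Hessian
    return [[2*x[1], 2*x[0]], [2*x[0], 2.0]]

x, u = [1.0, 1.0], [0.6, 0.8]       # ||u||_2 = 1

def phi(t):
    return f([xi + t*ui for xi, ui in zip(x, u)])

# Finite-difference approximations of phi'(0) and phi''(0).
h = 1e-4
phi1 = (phi(h) - phi(-h)) / (2*h)
phi2 = (phi(h) - 2*phi(0.0) + phi(-h)) / h**2

# The sums from the formulas above: grad . u and u^T H u.
first = sum(ui*gi for ui, gi in zip(u, grad_f(x)))
H = hess_f(x)
second = sum(u[i]*u[j]*H[i][j] for i in range(2) for j in range(2))
```

Here `first` and `second` are exactly the two double sums with the Jacobian collapsed to a gradient and the second-partials tensor collapsed to the Hessian, as requested in the question.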

It is clear that this process can be continued. In general, the $k$-th derivative can be identified with a multilinear map $(\mathbb{R}^n \times \cdots\times \mathbb{R}^n )\to \mathbb{R}^m$ defined by $(u_1,\dots, u_k) \mapsto \sum_{i_1} \cdots \sum_{i_k} [u_1]_{i_1} \cdots [u_k]_{i_k} {\partial^k f(x) \over \partial x_{i_1} \cdots \partial x_{i_k}}$, and $D^kf(x)$ is this map evaluated at $(u,\dots,u)$. Its coefficients form a $k$-th order tensor of partial derivatives (for each of the $m$ components of $f$), which answers the bonus question.
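As a concrete check of the general formula, here is a sketch for $k=3$, $m=1$ with the assumed example $f(x_0,x_1) = x_0^3 + x_0 x_1^2$: the tensor of third partials is contracted with three copies of $u$ and compared against a finite-difference $\phi'''(0)$.

```python
from itertools import product

# Illustrative sketch (k = 3, m = 1): contract the tensor of third partials
# with three copies of u, for the example f(x0, x1) = x0^3 + x0*x1^2.

def f(x):
    return x[0]**3 + x[0]*x[1]**2

# Third-partial tensor T[i][j][k] = d^3 f / (dx_i dx_j dx_k); for this cubic f
# it is constant: all entries vanish except d^3f/dx0^3 = 6 and the three
# permutations of d^3f/(dx0 dx1 dx1) = 2.
T = [[[0.0]*2 for _ in range(2)] for _ in range(2)]
T[0][0][0] = 6.0
T[0][1][1] = T[1][0][1] = T[1][1][0] = 2.0

x, u = [1.0, 1.0], [0.6, 0.8]

# D^3 f(x)(u, u, u): the triple sum from the formula above.
tensor_value = sum(u[i]*u[j]*u[k]*T[i][j][k]
                   for i, j, k in product(range(2), repeat=3))

# Compare with a finite-difference phi'''(0), where phi(t) = f(x + t*u).
def phi(t):
    return f([xi + t*ui for xi, ui in zip(x, u)])

h = 1e-2
fd_value = (phi(2*h) - 2*phi(h) + 2*phi(-h) - phi(-2*h)) / (2*h**3)
```

Since $\phi$ is a cubic polynomial in $t$ here, the central-difference formula for the third derivative is exact up to roundoff, so the two values agree closely.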