Difficulty interpreting high order derivatives in $\mathbb{R}^n$


If $f:U\subseteq\mathbb{R}^m\to \mathbb{R}^n$ is a differentiable function, then its derivative $$ f':U\to M_{n\times m}(\mathbb{R})\simeq\mathcal{L}(\mathbb{R}^m;\mathbb{R}^n) $$ can be seen, for each $x\in U$, as a linear transformation $f'(x):\mathbb{R}^m\to\mathbb{R}^n$.

If $f$ is of class $\mathcal{C}^2$, then its second order derivative is a function $$ f'':U\to\mathcal{L}_2(\mathbb{R}^m\times\mathbb{R}^m;\mathbb{R}^n) $$ that carries each $x\in U$ into a bilinear transformation $f''(x):\mathbb{R}^m\times\mathbb{R}^m\to\mathbb{R}^n$, and inductively, if $f$ is of class $\mathcal{C}^k$, its $k$th derivative is a function $$ f^{(k)}:U\to\mathcal{L}_k(\mathbb{R}^m\times\cdots\times\mathbb{R}^m;\mathbb{R}^n) $$ into the space of $k$-linear maps.

Maybe I'm lacking some linear algebra background, but using the isomorphism $\mathcal{L}_2(\mathbb{R}^m\times\mathbb{R}^m;\mathbb{R}^n)\simeq\mathcal{L}(\mathbb{R}^m;\mathcal{L}(\mathbb{R}^m;\mathbb{R}^n))$ and its analogue in the $k$-linear case, I can see that the $k$th derivative is a $k$-linear map. What I don't see is how to relate (taking the second derivative for simplicity) the second partial derivatives $\frac{\partial^2 f_i}{\partial x_j\partial x_k}(x)$ to the matrix of $f''(x)$ as a bilinear map. If $f''(x)$ were a bilinear form (that is, if $n=1$), it would be represented by an $m\times m$ matrix, but that is not the case here.

Can someone please explain this to me?

P.S.: I have seen this question, but I didn't understand the notation in the last part of the answer (the first part, which answers that question, I'm fine with).


Best answer

Note that when $f$ has codomain $\Bbb R^n$, $f''(x)$ is a bilinear map with codomain $\Bbb R^n$, not a scalar-valued bilinear form. As such, $f''(x)$ cannot (in its totality) be represented by a single matrix. Instead, $f''(x)$ is a third-order tensor and is more naturally presented as a $3$-dimensional array of size $n\times m\times m$.

The $i$th coordinate of the output of $f''(x)$, however, is a bilinear map to $\Bbb R$, which means that it can be represented as an $m\times m$ matrix. The matrix corresponding to this bilinear map is the same as the matrix corresponding to $f_i''(x)$, where $f_i$ denotes the $i$th component of $f$. In particular, $f_i''(x)$ is represented by the Hessian matrix of $f_i$.

You can accordingly think of $f''(x)$ as a $3$-dimensional array where each of $n$ $2$-dimensional layers is a Hessian of one of the components $f_i$ of $f$.
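As a numerical sanity check of this layered picture, here is a minimal sketch in plain Python. The map `f`, the point `p`, and the helper names (`second_derivative`, `apply_bilinear`, `mixed_difference`) are all hypothetical choices made for illustration, with $m = n = 2$. It stores $f''(p)$ as a list of two Hessians, evaluates the bilinear map on a pair of vectors, and compares the result with a finite-difference approximation.

```python
import math

def f(p):
    """Example map f: R^2 -> R^2 (an arbitrary choice for illustration)."""
    x, y = p
    return [x**2 * y, math.sin(x) + y**3]

def second_derivative(p):
    """Return f''(p) as a 3-dimensional array T with
    T[i][j][k] = d^2 f_i / (dx_j dx_k), i.e. one Hessian per component."""
    x, y = p
    hessian_f1 = [[2 * y, 2 * x],
                  [2 * x, 0.0]]          # Hessian of x^2 * y
    hessian_f2 = [[-math.sin(x), 0.0],
                  [0.0, 6 * y]]          # Hessian of sin(x) + y^3
    return [hessian_f1, hessian_f2]

def apply_bilinear(T, u, v):
    """Evaluate the bilinear map: output i is sum_{j,k} T[i][j][k] * u[j] * v[k]."""
    return [sum(layer[j][k] * u[j] * v[k]
                for j in range(len(u)) for k in range(len(v)))
            for layer in T]

def mixed_difference(p, u, v, h=1e-4):
    """Finite-difference approximation of the second directional
    derivative of f at p along u and v (accurate only up to O(h))."""
    def shifted(a, b):
        return f([p[i] + a * u[i] + b * v[i] for i in range(len(p))])
    f_uv, f_u, f_v, f_0 = shifted(h, h), shifted(h, 0.0), shifted(0.0, h), shifted(0.0, 0.0)
    return [(a - b - c + d) / h**2 for a, b, c, d in zip(f_uv, f_u, f_v, f_0)]

p = [1.0, 2.0]
u = [1.0, 0.0]
v = [0.0, 1.0]

T = second_derivative(p)
exact = apply_bilinear(T, u, v)       # layer i contributes u^T H_i v
approx = mixed_difference(p, u, v)    # should agree with `exact` up to O(h)

print(exact)
print(approx)
```

With $u = e_1$ and $v = e_2$, the bilinear map simply picks out the $(1,2)$ entry of each layer, which for this particular `f` is $2x = 2$ in the first layer and $0$ in the second; the finite-difference values should agree with these up to an error of order $h$.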


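Here's a proof that $f''(x)(u,v) = \frac{\partial^2 f}{\partial u\,\partial v}(x)$ for vectors $u,v\in\Bbb R^m$. If we go back to the $\mathcal{L}(\mathbb{R}^m;\mathcal{L}(\mathbb{R}^m;\mathbb{R}^n))$ definition, $f''(x)(v)$ is the linear map characterized by the requirement that, for $h \in \Bbb R$ and every $u$, $$ f'(x + hv)(u) = f'(x)(u) + h\,f''(x)(v)(u) + o(h). $$ On the other hand, we already know that $f'(x)(u) = \frac{\partial f}{\partial u}(x)$, the directional derivative of $f$ along $u$. With that, we have $$ \frac{\partial f}{\partial u}(x + hv) = \frac{\partial f}{\partial u}(x) + h\,f''(x)(v)(u) + o(h), $$ and solving for $f''(x)(v)(u)$ gives $$ f''(x)(v)(u) = \frac 1h\left(\frac{\partial f}{\partial u}(x + hv) - \frac{\partial f}{\partial u}(x)\right) + \frac{o(h)}{h}. $$ As $h \to 0$, the error term vanishes and the difference quotient converges to the directional derivative of $\frac{\partial f}{\partial u}$ along $v$, so $f''(x)(v)(u) = \frac{\partial^2 f}{\partial v\,\partial u}(x)$. Since $f$ is of class $\mathcal{C}^2$, the mixed derivatives commute, so the order of $u$ and $v$ does not matter and the conclusion follows.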
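To connect this with the partial derivatives in the question, one can expand both arguments in the standard basis. The following is a sketch, assuming $e_1,\dots,e_m$ denotes the standard basis of $\Bbb R^m$ and $H_{f_i}(x)$ denotes the Hessian matrix of $f_i$ at $x$ (notation introduced here for illustration, not taken from the question). Writing $u = \sum_j u_j e_j$ and $v = \sum_k v_k e_k$, bilinearity gives $$ f''(x)(u,v) = \sum_{j,k=1}^m u_j v_k\, f''(x)(e_j,e_k), \qquad \big(f''(x)(e_j,e_k)\big)_i = \frac{\partial^2 f_i}{\partial x_j\,\partial x_k}(x), $$ so the $i$th component of $f''(x)(u,v)$ is $u^{T} H_{f_i}(x)\, v$. In other words, the numbers $\frac{\partial^2 f_i}{\partial x_j\,\partial x_k}(x)$ are exactly the $n\cdot m\cdot m$ entries of the array, indexed by $(i,j,k)$.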