What does it mean for a vector function to be $C^k$?


So, I know that a function $f :\mathbb{R}^m\to\mathbb{R}^n$ is differentiable on $U\subset\mathbb{R}^m$ iff for every point $a\in U$ there exists a linear transformation, call it $Df(a)$, such that $$ \lim_{\vec{h}\to 0}\frac{\|f(a + \vec{h}) - f(a) - Df(a)\vec{h}\|}{\|\vec{h}\|} = 0 $$ From this you can deduce that if $f$ is differentiable then all the partial derivatives of $f$ exist on $U$ and $Df(a)$ is the Jacobian of $f$ at $a$. We write $f\in C^1(U)$ to say that $f$ is continuously differentiable on $U$, i.e., that $Df$ exists and is continuous on $U$.
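The defining limit can be checked numerically. The following is a minimal sketch (assuming NumPy); the map $f(x,y)=(x^2y,\ \sin x + y)$ is a hypothetical example chosen for illustration, not one from the question.

```python
import numpy as np

# Hypothetical example: f : R^2 -> R^2, f(x, y) = (x^2 * y, sin(x) + y).
def f(v):
    x, y = v
    return np.array([x**2 * y, np.sin(x) + y])

def jacobian(v):
    # Hand-computed Jacobian of f at v.
    x, y = v
    return np.array([[2 * x * y, x**2],
                     [np.cos(x), 1.0]])

a = np.array([1.0, 2.0])
rng = np.random.default_rng(0)
for scale in [1e-1, 1e-3, 1e-5]:
    h = scale * rng.standard_normal(2)
    ratio = np.linalg.norm(f(a + h) - f(a) - jacobian(a) @ h) / np.linalg.norm(h)
    print(scale, ratio)  # the ratio shrinks roughly in proportion to ||h||
```

As $\|h\|$ shrinks, the ratio tends to $0$, which is exactly the statement that the Jacobian is the (unique) linear map tangent to $f$ at $a$.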

So, what exactly does it mean to say $f\in C^k(U)$ for $U\subset\mathbb{R}^m$ and $k>1$? My only thought is that it requires all the partial derivatives of order $k$ to exist, but existence of partials is not enough even in the $C^1$ case: not only must the partials exist, the above limit with $Df(a)$ equal to the Jacobian must also be zero (and these two conditions are not always simultaneously true). My second thought was that, at least for $k = 2$, the Hessian of $f$ must exist and some limit involving the Hessian must go to zero, but I can't figure out what that limit is.

As far as I can tell, there is no obvious notion of the "second derivative" of a vector function. The "derivative" of a vector function, as I understand it, is $Df$, a function that maps a point $a$ to a matrix $Df(a)$. So if we were to differentiate this again, what kind of object would the "second" derivative be? Would it be a tensor?


Best answer:

Well, that's pretty simple if you take the abstract point of view: as you mention, the differential of a map $f\colon U\subset \mathbf R^m\longrightarrow \mathbf R^n$ at a point $a\in U$ is a linear map $Df(a)\in \mathcal L(\mathbf R^m,\mathbf R^n)$ which is tangent to $f$ at $a$.

Now, if $f$ is differentiable at every point of $U$, you define a (first order) differential map \begin{align} Df\colon U\subset\mathbf R^m & \longrightarrow \mathcal L(\mathbf R^m,\mathbf R^n),\\ a & \longmapsto Df(a). \end{align} This differential map $Df$ may in turn be differentiable at a point $a\in U$. You then obtain a second order differential $D^2f(a)$, which is a linear map in $\;\mathcal L\bigl(\mathbf R^m,\mathcal L(\mathbf R^m,\mathbf R^n)\bigr)$, tangent to $Df$ at $a$.

As we have a canonical isomorphism $\;\mathcal L\bigl(\mathbf R^m,\mathcal L(\mathbf R^m,\mathbf R^n)\bigr)\simeq\mathcal L^2(\mathbf R^m,\mathbf R^n)$ (the space of bilinear maps from $\mathbf R^m\times\mathbf R^m$ to $\mathbf R^n$), we identify $D^2f(a)$ with the corresponding bilinear map; when $n=1$, this bilinear map is represented by the Hessian matrix once a basis has been chosen. Iterating this construction, $f\in C^k(U)$ means that $D^kf$ exists at every point of $U$ and is continuous as a map from $U$ into the space of $k$-linear maps $\mathcal L^k(\mathbf R^m,\mathbf R^n)$.
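The identification of $D^2f(a)$ with a bilinear map can be made concrete numerically. Here is a minimal sketch (assuming NumPy) for a hypothetical scalar example $f(x,y)=x^3y+y^2$, where the bilinear map is $(u,v)\mapsto u^\mathsf{T}Hv$ with $H$ the Hessian:

```python
import numpy as np

# Hypothetical scalar example (n = 1): f(x, y) = x^3 * y + y^2.
# Df(a) is given by the gradient, and D^2 f(a) is the bilinear map
# (u, v) |-> u^T H v, where H is the Hessian matrix of f at a.
def grad(v):
    x, y = v
    return np.array([3 * x**2 * y, x**3 + 2 * y])

def hessian(v):
    x, y = v
    return np.array([[6 * x * y, 3 * x**2],
                     [3 * x**2, 2.0]])

a = np.array([1.0, 2.0])
u, v = np.array([1.0, -1.0]), np.array([0.5, 2.0])

H = hessian(a)
exact = u @ H @ v  # D^2 f(a)(u, v)

# Finite-difference version: differentiate Df (here: grad) in the
# direction u, then apply the resulting linear map to v.
t = 1e-6
approx = (grad(a + t * u) - grad(a)) / t @ v

assert np.allclose(H, H.T)          # D^2 f(a) is a symmetric bilinear map
assert abs(exact - approx) < 1e-4   # the two computations agree
```

The symmetry assertion reflects the equality of mixed partials for $C^2$ functions (Schwarz's theorem), which is why $D^2f(a)$ is a symmetric bilinear map.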

Another answer:

Theorem: Let $U\subset \mathbb{R}^m$ be open and let $f:U\longrightarrow \mathbb{R}$ be such that, for each $i\in\{1,\dots,m\}$, the partial derivative $f_i=\frac{\partial f}{\partial x_i}$ exists and is continuous throughout $U$. Then $f\in C^1(U)$.

Proof: For each $x\in U$, $i\in\{1,\dots,m\}$ and $h_i\in\mathbb{R}$ such that $x+h_ie_i \in U$, we have that

$$f(x+h_ie_i)=f(x)+h_if_i(x)+h_i\epsilon_{i}(h_i;x), \tag{1}$$

where $\epsilon_{i}$ satisfies $\lim_{t\to0}\epsilon_{i}(t;x)=0$ for all $x$. This limit condition implies that

$$\lim_{h_i\to0}\frac{f(x+h_ie_i)-f(x)}{h_i}-f_i(x)=\lim_{h_i\to0}\epsilon_i(h_i;x)=0$$

and hence that $f_i(x)=\lim_{h_i\to0}\frac{f(x+h_ie_i)-f(x)}{h_i}$. Notice that because $U$ is open, there is always a neighborhood of $0$ in $\mathbb{R}$ of valid $h_i$.

The value of $\epsilon_i$ when $h_i=0$ does not change the validity of equation $(1)$, so we set $\epsilon_i(0;x)=0$ for all $x$. It then follows that $\epsilon_i(\cdot\,;x)$ is continuous in $h_i$ for all $x$, and that $\epsilon_i(0\,;\,\cdot)$ is continuous in $x$.
Moreover, it follows from the continuity of $f_i$ that $\epsilon_{i}(h_i\,;\,\cdot)$ is continuous in $x$ for all $h_i\neq 0$, and hence for all $h_i$.

Thus, $\epsilon_{i}$ is continuous and $\epsilon_{i}(0;x)=0$ for all $x$.

Now, let $a\in U$ and $h=(h_1,\dots,h_m)\in\mathbb{R}^m$ be such that $a+h\in U$. Then:

\begin{align} f(a+h)-f(a) =f\left(a+\sum_{i=1}^mh_ie_i\right)-f(a) =\sum_{k=1}^m\,f\left(a+\sum_{i=1}^{k}h_ie_i\right)-f\left(a+\sum_{i=1}^{k-1}h_ie_i\right), \end{align}

where the sum $\sum_{i=1}^{0}$, which occurs when $k=1$, is taken to be the empty sum, $0$. We may then rewrite this as

\begin{align} f(a+h)-f(a)&=\sum_{k=1}^m\,\left[f\left(\left(a+\sum_{i=1}^{k-1}h_ie_i\right)+h_ke_k\right)-f\left(a+\sum_{i=1}^{k-1}h_ie_i\right)\right]\\ &=\sum_{k=1}^m\,\left[h_k\, f_k\left(a+\sum_{i=1}^{k-1}h_ie_i\right) + h_k\epsilon_{k}\left(h_k;a+\sum_{i=1}^{k-1}h_ie_i\right)\right] \end{align}
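The telescoping decomposition above is purely algebraic and can be sanity-checked numerically. A minimal sketch, assuming NumPy; the function $f(x,y,z)=xy+z^2$ and the points are hypothetical choices:

```python
import numpy as np

# Check the telescoping identity for m = 3 and a hypothetical
# f(x, y, z) = x*y + z^2 (any f works; the identity is algebraic).
def f(v):
    x, y, z = v
    return v[0] * v[1] + v[2]**2

a = np.array([1.0, -2.0, 0.5])
h = np.array([0.3, 0.1, -0.4])
e = np.eye(3)

# Move from a to a + h one coordinate at a time:
# partial_points[k] = a + sum_{i<=k} h_i e_i, with partial_points[0] = a.
partial_points = [a + sum(h[i] * e[i] for i in range(k)) for k in range(4)]
telescoped = sum(f(partial_points[k + 1]) - f(partial_points[k])
                 for k in range(3))

assert np.isclose(telescoped, f(a + h) - f(a))
```

Each summand changes only one coordinate, which is what lets the proof apply the one-variable expansion $(1)$ term by term.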

Consider the gradient $\nabla f:U\longrightarrow \mathbb{R}^m$ of $f$, that is, $\nabla f(x)=(f_1(x),\dots,f_m(x))$, and define $Df(a)h=\langle\nabla f(a), h\rangle=\sum_{i=1}^m\,h_if_i(a)$. Then:

\begin{align} f(a+h)-f(a)-Df(a)h &=\sum_{k=1}^m\,\left[h_k \left(f_k\left(a+\sum_{i=1}^{k-1}h_ie_i\right) -f_k(a)\right) + h_k\epsilon_{k}\left(h_k;a+\sum_{i=1}^{k-1}h_ie_i\right)\right] \end{align}

Now, as $h\to 0$, we have that

$$f_k\left(a+\sum_{i=1}^{k-1}h_ie_i\right) -f_k(a)\longrightarrow 0$$

because the $f_k$ are continuous. Moreover, $|h_k|\leq \lVert h\rVert$, so $\frac{h_k}{\lVert h \rVert}$ remains bounded as $h\to 0$ and hence $\frac{h_k}{\lVert h \rVert}\cdot \left(f_k\left(a+\sum_{i=1}^{k-1}h_ie_i\right) -f_k(a)\right)$ vanishes as $h\to 0$.

Finally, in a similar fashion we have that $\epsilon_{k}\left(h_k;a+\sum_{i=1}^{k-1}h_ie_i\right) \to 0$ as $h\to0$, because $\epsilon_{k}$ is continuous and it vanishes whenever the first argument is $0$. Since $\frac{h_k}{\lVert h \rVert}$ remains bounded as $h\to 0$, once again the product $\frac{h_k}{\lVert h \rVert}\cdot \epsilon_{k}\left(h_k;a+\sum_{i=1}^{k-1}h_ie_i\right)$ vanishes as $h\to 0$.

It follows that the limit $\lim_{h\to0}\frac{f(a+h)-f(a)-Df(a)h}{\lVert h \rVert}$ exists and is $0$, that is, $f$ is differentiable at $a$ and $Df(a)=\langle\nabla f(a),\cdot\rangle$.

Since $a$ was arbitrary, we have that $Df$ is given by $x\mapsto \langle \nabla f(x),\cdot\rangle$. Finally, because the $f_i$ are continuous, $Df$ itself is continuous, which concludes the proof. $\square$
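As a sanity check of the theorem's conclusion, the defining ratio with $Df(a)h=\langle\nabla f(a),h\rangle$ can be observed to vanish numerically. A minimal sketch assuming NumPy; $f(x,y)=e^x y$ is a hypothetical example with continuous partials:

```python
import numpy as np

# Hypothetical example: f(x, y) = exp(x) * y has continuous partials,
# so by the theorem f is differentiable with Df(a)h = <grad f(a), h>.
def f(v):
    x, y = v
    return np.exp(x) * y

def grad(v):
    x, y = v
    return np.array([np.exp(x) * y, np.exp(x)])

a = np.array([0.5, 1.5])
rng = np.random.default_rng(1)
ratios = []
for scale in [1e-2, 1e-4, 1e-6]:
    h = scale * rng.standard_normal(2)
    r = abs(f(a + h) - f(a) - grad(a) @ h) / np.linalg.norm(h)
    ratios.append(r)
print(ratios)  # the ratios decrease toward 0 as ||h|| shrinks
```

Note that continuity of the partials is essential here: there are standard examples where all partials exist at a point yet the ratio fails to tend to $0$.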