Proving chain rule for Caratheodory derivative


I am trying to follow the argument given in the following papers:

  1. S. Kuhn, "The Derivative a la Caratheodory" (1991);
  2. E. Acosta G., C. Delgado G., "Frechet vs. Caratheodory" (1994);
  3. S. Arora, H. Browne, D. Daners, "An alternative approach to Frechet derivatives" (2019).

Definition of differentiability. A function $f : \mathbb{R}^n \to \mathbb{R}^m$ is said to be differentiable (in the sense of Caratheodory) at a point $x \in \mathbb{R}^n$ if there is a one-parameter family of linear functions $\Phi_{x, \cdot} : \mathbb{R}^n \to L(\mathbb{R}^n, \mathbb{R}^m)$, $y \mapsto \Phi_{x,y}$, that satisfies $$ \Phi_{x,y}(y-x) = f(y)-f(x) $$ for all $y$ and is continuous at $x$, in other words, $$ \lim \limits_{y \to x} \Phi_{x,y} = \Phi_{x,x}. $$

Here, the limit is to be understood in the usual sense, with respect to some metric on the space $L(\mathbb{R}^n, \mathbb{R}^m)$ of all linear functions from $\mathbb{R}^n$ to $\mathbb{R}^m$ (usually the one induced by the operator norm). The derivative of $f$ at $x$ is then defined to be the linear function $\Phi_{x,x}$.
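
As a small illustration of the definition (an example not taken from the papers), consider $f(y) = |y|^2 = \langle y, y \rangle$ on $\mathbb{R}^n$ with values in $\mathbb{R}$. A suitable family can be read off from the identity $\langle y, y \rangle - \langle x, x \rangle = \langle x + y, y - x \rangle$:

$$ \Phi_{x,y}(h) = \langle x + y, h \rangle, \qquad \Phi_{x,y}(y-x) = \langle x + y, y - x \rangle = |y|^2 - |x|^2 = f(y) - f(x). $$

Since $|\Phi_{x,y} - \Phi_{x,x}| = |y - x|$ in the operator norm, this family is continuous at $x$, and $\Phi_{x,x}(h) = 2\langle x, h \rangle$ is the expected derivative.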

Proof of the chain rule given in the papers. Assume that we are given functions $f : \mathbb{R}^n \to \mathbb{R}^m$ and $g : \mathbb{R}^m \to \mathbb{R}^p$ which are differentiable at $x \in \mathbb{R}^n$ and $f(x) \in \mathbb{R}^m$, respectively. Let us start by looking for a family of linear functions that might be reasonable for $g \circ f$. Since $f$ and $g$ are differentiable, there are one-parameter families of linear functions $\Phi^{f}_{x, \cdot}$ and $\Phi^{g}_{f(x), \cdot}$ as in the definition. Then for every $y$ we have:

$$ (g \circ f)(y) - (g \circ f)(x) = \Phi^{g}_{f(x), f(y)}(f(y)-f(x)) = (\Phi^{g}_{f(x), f(y)} \circ \Phi^{f}_{x,y}) (y-x)$$

This motivates considering the one-parameter family of linear functions given as follows.

$$ \Phi^{g \circ f}_{x,\cdot} = \Phi^{g}_{f(x), f(\cdot)} \circ \Phi^{f}_{x,\cdot} $$

Now, the only remaining thing is to show that it is continuous at $x$.

Part of the proof I do not understand: The claim is that this one-parameter family of linear functions $\Phi^{g \circ f}_{x,\cdot}$ is continuous at $x$ because $\Phi^{g}_{f(x), \cdot}$, $\Phi^{f}_{x, \cdot}$ and $f$ are all continuous functions and a composition of continuous functions is again continuous. I have trouble understanding this, as I think that the composition $\circ$ of the linear functions is taken with respect to their vector argument and not with respect to the parameter $y$, so it is not a composition of functions of $y$.
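
One possible reading of the claim, offered here as an assumption about what the papers intend rather than a quotation from them, is that the family becomes a composition of maps in the variable $y$ once the composition of linear functions is itself treated as a map $C(A, B) = A \circ B$ (the name $C$ is introduced only for this sketch):

$$ y \;\longmapsto\; \big(\Phi^{g}_{f(x), f(y)},\ \Phi^{f}_{x,y}\big) \;\overset{C}{\longmapsto}\; \Phi^{g}_{f(x), f(y)} \circ \Phi^{f}_{x,y}, \qquad C : L(\mathbb{R}^m, \mathbb{R}^p) \times L(\mathbb{R}^n, \mathbb{R}^m) \to L(\mathbb{R}^n, \mathbb{R}^p). $$

The first arrow is continuous at $x$ because each component is (the first component being $\Phi^{g}_{f(x), \cdot} \circ f$, a genuine composition in $y$), and $C$ is bilinear between finite-dimensional spaces, hence continuous. Under this reading, the continuity of $C$ is the ingredient that the phrase "composition of continuous functions" leaves implicit.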

My approach: Instead, I would propose the following proof, where the key idea is not that a composition of continuous functions is continuous but rather that the operator norm is submultiplicative, so the proof is similar to the proof of the product rule for limits, with composition playing the role of the product.

Using that composition of linear functions is bilinear, together with the triangle inequality, we obtain the following.

$$|\Phi^{g}_{f(x), f(y)} \circ \Phi^{f}_{x,y}-\Phi^{g}_{f(x), f(x)} \circ \Phi^{f}_{x,x}| = $$ $$ = |(\Phi^{g}_{f(x), f(y)}-\Phi^{g}_{f(x), f(x)}) \circ (\Phi^{f}_{x,y}-\Phi^{f}_{x,x}) + (\Phi^{g}_{f(x), f(y)}-\Phi^{g}_{f(x), f(x)}) \circ \Phi^{f}_{x,x} + \Phi^{g}_{f(x), f(x)} \circ (\Phi^{f}_{x,y} - \Phi^{f}_{x,x}) | $$ $$ \leq |(\Phi^{g}_{f(x), f(y)}-\Phi^{g}_{f(x), f(x)}) \circ (\Phi^{f}_{x,y}-\Phi^{f}_{x,x})| + | (\Phi^{g}_{f(x), f(y)}-\Phi^{g}_{f(x), f(x)}) \circ \Phi^{f}_{x,x} | + |\Phi^{g}_{f(x), f(x)} \circ (\Phi^{f}_{x,y} - \Phi^{f}_{x,x}) |$$

Now, use the submultiplicative property of the operator norm to bound the norm of each composition by the product of the individual norms.

$$|\Phi^{g}_{f(x), f(y)} \circ \Phi^{f}_{x,y}-\Phi^{g}_{f(x), f(x)} \circ \Phi^{f}_{x,x}| $$ $$ \leq |\Phi^{g}_{f(x), f(y)}-\Phi^{g}_{f(x), f(x)}||\Phi^{f}_{x,y}-\Phi^{f}_{x,x}| + |\Phi^{g}_{f(x), f(y)}-\Phi^{g}_{f(x), f(x)}| | \Phi^{f}_{x,x} | + |\Phi^{g}_{f(x), f(x)}||\Phi^{f}_{x,y} - \Phi^{f}_{x,x}| $$
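
For reference, the submultiplicativity used here follows directly from the definition of the operator norm. A short derivation, with $A$ and $B$ denoting arbitrary composable linear functions and $v$ an arbitrary vector (symbols introduced only for this sketch):

$$ |A| = \sup_{|v| \le 1} |Av|, \qquad |(A \circ B)v| \le |A|\,|Bv| \le |A|\,|B|\,|v| \quad \Longrightarrow \quad |A \circ B| \le |A|\,|B|. $$

Applying this to each of the three compositions separately gives the bound above.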

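Now fix $\varepsilon > 0$ and put $\mu = \textrm{min}(1,\varepsilon)$. By continuity of $\Phi^{g}_{f(x), \cdot}$ at $f(x)$, there is $\eta > 0$ such that $|z - f(x)| < \eta$ implies $|\Phi^{g}_{f(x), z}-\Phi^{g}_{f(x), f(x)}| < \mu/(3(1+| \Phi^{f}_{x,x} |))$. Since $f$ is continuous at $x$ (as it is differentiable there), there is $\delta_1 > 0$ such that $|y-x| < \delta_1$ implies $|f(y) - f(x)| < \eta$, and hence $|\Phi^{g}_{f(x), f(y)}-\Phi^{g}_{f(x), f(x)}| < \mu/(3(1+| \Phi^{f}_{x,x} |))$. Also, by continuity of $\Phi^{f}_{x, \cdot}$ at $x$, there is $\delta_2 > 0$ such that $|y-x| < \delta_2$ implies $|\Phi^{f}_{x,y}-\Phi^{f}_{x,x}| < \mu/(3(1+|\Phi^{g}_{f(x), f(x)}|))$. Then, for $|y-x| < \delta = \textrm{min}(\delta_1, \delta_2)$, each of the three terms in the bound above is less than $\mu/3$ (the first one because it is less than $\mu^2/9 \leq \mu/3$), so the left-hand side is less than $\mu \leq \varepsilon$, which is the desired result.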

Problems with my approach: Even if we assume that my proof is correct, it is somewhat at odds with what the authors claim about why the Caratheodory derivative might be useful. The usual claim is that proofs become easier because one only needs to consider continuity of functions and does not have to rely on many $\varepsilon$-$\delta$ computations. On the other hand, my proof shows that the Caratheodory approach requires many details to be checked: for example, it requires that, out of all norms, we use one that is submultiplicative (it is not obvious to me why such a choice of norm would be natural), and the proof still works with $\varepsilon$-$\delta$. Also, it is not clear to me how (or whether it is possible) to rewrite my approach in a more concise way using only that a composition of continuous functions is continuous (or a similar result).
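
On the choice of norm, a standard fact may be relevant (it is stated here as background and is not taken from the papers): $L(\mathbb{R}^n, \mathbb{R}^m)$ is finite-dimensional, so any norm $\|\cdot\|$ on it is equivalent to the operator norm $|\cdot|$, meaning there are constants $0 < c \leq C$ with

$$ c\,|A| \leq \|A\| \leq C\,|A| \quad \text{for all } A \in L(\mathbb{R}^n, \mathbb{R}^m). $$

Consequently, $\Phi_{x,y} \to \Phi_{x,x}$ holds in one norm exactly when it holds in every other, so the definition of differentiability does not depend on the norm, and the operator norm is only a convenient choice for carrying out the estimate.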