I'm reading about derivatives in Amann's Analysis I. The authors define the derivative for functions defined on an arbitrary subset of $\mathbb K \in \{\mathbb R, \mathbb C\}$. To ensure that taking the limit is meaningful, they use the concept of a limit point.
I'm trying to do the same for vector-valued functions. Could you check whether my construction makes sense? It seems the chain rule still holds under this extended definition.
Let $(E, |\cdot|_E)$ and $(F, |\cdot|_F)$ be normed vector spaces. Let $X$ be an arbitrary non-empty subset of $E$. Notice that we do not require $X$ to be open in $E$. Let $f: X \to F$ and let $a \in X$ be a limit point of $X$, i.e., there is at least one sequence $(x_n) \subset X \setminus \{a\}$ such that $x_n \to a$. We say that $f$ is differentiable at $a$ if there is $A \in \mathcal L(E, F)$ such that $$ \lim_{x \to a} \frac{f(x)-f(a) - A(x-a)}{|x-a|_E} = 0_F. $$
It is easy to verify that the following statements are equivalent:
- $f$ is differentiable at $a$.
- there is $A \in \mathcal L(E, F)$ such that $$ f(x) = f(a)+ A(x-a) + o(|x-a|_E) \quad \text{as} \quad x \to a. $$
- there are $A \in \mathcal L(E, F)$ and $r:X \to F$ such that $r(a)=0$, $r$ is continuous at $a$, and $$ f(x) = f(a)+ A(x-a) + r(x)|x-a|_E \quad \forall x \in X. $$
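For the passage from the limit definition to the remainder form, one natural choice of $r$ (a sketch; the case split is the only ingredient not written out in the list) is $$ r(x) := \begin{cases} \dfrac{f(x)-f(a)-A(x-a)}{|x-a|_E} & \text{if} \quad x \neq a \\ 0_F & \text{if} \quad x = a. \end{cases} $$ With this choice the identity $f(x) = f(a)+ A(x-a) + r(x)|x-a|_E$ holds for every $x \in X$ by construction, and continuity of $r$ at $a$ is exactly the limit condition in the definition.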
Here $o(|x-a|_E)$ is the Landau little-o symbol. We call such $A$ a derivative of $f$ at $a$. We do not know whether such $A$ is unique. Let us denote it by $\partial f (a)$. Even though $\partial f (a)$ may not be unique, if $f$ is differentiable at $a$, then $f$ is continuous at $a$.
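As an illustration of how uniqueness can fail (an example that is not from the book; the sets and maps below are assumptions chosen for simplicity), take $E = \mathbb R^2$, $F = \mathbb R$, $X = \mathbb R \times \{0\}$, $a = (0,0)$, and $f \equiv 0$. For any $c \in \mathbb R$, define $A_c(h_1, h_2) := c\, h_2$. Every $x \in X$ has the form $(x_1, 0)$, so $A_c(x-a) = 0$ and $$ \frac{f(x)-f(a) - A_c(x-a)}{|x-a|_E} = 0 \quad \forall x \in X \setminus \{a\}. $$ Hence every $A_c$ satisfies the definition, and distinct values of $c$ give distinct derivatives of $f$ at $a$. By contrast, if $a$ is an interior point of $X$, the usual argument (testing the limit along $x = a + th$ with $t \to 0$ for each direction $h$) should give uniqueness.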
Theorem: Let $(G, |\cdot|_G)$ be a normed vector space. Let $f:X \to F$, $Y:=f(X)$, and $g:Y \to G$. Assume $a\in X$ is a limit point of $X$, and $b:=f(a)$ is a limit point of $Y$. If $f, g$ are differentiable at $a,b$ respectively, then $g \circ f$ is differentiable at $a$, and $$ \partial (g \circ f) (a) = [\partial g (b)] \circ [\partial f (a)]. $$
Proof: Because $f$ is differentiable at $a$, there is $r:X \to F$ such that $r(a)=0$, $r$ is continuous at $a$, and $$ f(x) = f(a)+ \partial f (a)(x-a) + r(x)|x-a|_E \quad \forall x \in X. $$
Because $g$ is differentiable at $b$, there is $s:Y \to G$ such that $s(b)=0$, $s$ is continuous at $b$, and $$ g(y) = g(b)+ \partial g (b)(y-b) + s(y)|y-b|_F \quad \forall y \in Y. $$
We replace $y$ by $f(x)$ and $b$ by $f(a)$, use $f(x) - f(a) = \partial f (a)(x-a) + r(x)|x-a|_E$, and get $$ \begin{align} g \circ f (x) &= g \circ f (a) + \partial g (b)[\partial f (a)(x-a) + r(x)|x-a|_E] \\ & \quad + (s \circ f (x)) |\partial f (a)(x-a) + r(x)|x-a|_E|_F. \end{align} $$
Because $\partial g (b)$ is linear, we get $$ \begin{align} g \circ f (x) &= g \circ f (a) + [\partial g (b)] \circ [\partial f (a)](x-a) + |x-a|_E \partial g (b)(r(x)) \\ & \quad + (s \circ f (x)) |\partial f (a)(x-a) + r(x)|x-a|_E|_F. \end{align} $$
Reorganizing a little bit, we get $$ g \circ f (x) = g \circ f (a) + [\partial g (b)] \circ [\partial f (a)](x-a) + t(x)|x-a|_E, $$ where $$ t(x) = \begin{cases} \partial g (b)(r(x)) + (s \circ f (x)) \left |\partial f (a) \left (\frac{x-a}{|x-a|_E} \right) + r(x) \right |_F & \text{if} \quad x \neq a \\ 0 & \text{if} \quad x = a. \end{cases} $$
Since $t(a) = 0$, it remains to prove $\lim_{x \to a} t(x)=0$, i.e., that $t$ is continuous at $a$. We have $$ |\partial g (b)(r(x))|_G \le \|\partial g (b)\| \cdot |r(x)|_F \to 0 \quad \text{as} \quad x \to a. $$ Moreover, $f$ is continuous at $a$ and $s$ is continuous at $b = f(a)$ with $s(b) = 0$, so $s \circ f (x) \to 0$ as $x \to a$. Hence $$ \begin{align} &| s \circ f (x)|_G \cdot \left | \partial f (a) \left (\frac{x-a}{|x-a|_E} \right) + r(x) \right |_F \\ \le &|s \circ f (x)|_G \cdot \left ( \|\partial f (a)\| + |r(x)|_F \right ) \to 0 \quad \text{as} \quad x \to a. \end{align} $$
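The last inequality combines the triangle inequality with the operator-norm bound. Spelled out as a sketch, writing $h := x - a \neq 0$ (a shorthand introduced only here): $$ \left | \partial f (a) \left (\frac{h}{|h|_E} \right) + r(x) \right |_F \le \left | \partial f (a) \left (\frac{h}{|h|_E} \right) \right |_F + |r(x)|_F \le \|\partial f (a)\| \cdot \left | \frac{h}{|h|_E} \right |_E + |r(x)|_F = \|\partial f (a)\| + |r(x)|_F. $$ This factor stays bounded near $a$ because $r(x) \to 0$, so the product with $|s \circ f (x)|_G$ tends to $0$.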
This completes the proof.