Theorem 6.4.1. Let $E$ be a subset of $\mathbb{R}^n$, and let $F$ be a subset of $\mathbb{R}^m$. Let $f : E \to F$ be a function, and let $g : F \to \mathbb{R}^p$ be another function. Let $x_0$ be a point in the interior of $E$. Suppose that $f$ is differentiable at $x_0$, and that $f(x_0)$ is in the interior of $F$. Suppose also that $g$ is differentiable at $f(x_0)$. Then $g \circ f: E \to \mathbb{R}^p$ is also differentiable at $x_0$, and we have the formula $$(g \circ f)'(x_0) = g'(f(x_0)) f'(x_0).$$
Exercise 6.4.3. Prove Theorem 6.4.1. (Hint: you may wish to review the proof of the ordinary chain rule in single variable calculus. The easiest way to proceed is by using the sequence-based definition of limit.)
In the single-variable case, I first define $G_0(y) = \frac{g(y) - g(f(x_0))}{y-f(x_0)}$ for $y \neq f(x_0)$, and then extend it to $G(y) = G_0(y)$ if $y \neq f(x_0)$ and $G(y) = g'(f(x_0))$ if $y = f(x_0)$. Then, for any sequence $(x_n)$ with $x_n \neq x_0$ and $x_n \to x_0$, we have $$(g \circ f)'(x_0) =\lim_{n\to \infty} G(f(x_n)) \frac{f(x_n) - f(x_0)}{x_n- x_0} =g'(f(x_0)) f'(x_0).$$
I am trying to prove the multivariable case in a similar way. First, note that $(g \circ f)'(x_0) = ((g \circ f)_1'(x_0), \dots , (g \circ f)_p'(x_0))$ (is this true?). Then our task is to show the chain rule for each component $(g \circ f)_i: E \to \mathbb{R}$. That is, I need to show that $$\lim_{n \to \infty} \frac{(g \circ f)_i(x_n) - (g \circ f)_i(x_0)}{\|x_n -x_0\|} = \big(g'(f(x_0)) f'(x_0)\big)_i. $$ But I am not sure how to proceed from here. Am I using the right approach? I would appreciate any help.
By definition, $g$ being differentiable at $y_0 = f(x_0)$ means that, for all sufficiently small $k$, $$g(y_0 + k) = g(y_0) + g'(y_0) \cdot k + \psi(k) \|k\|,$$ with $\lim_{k \to 0} \psi(k) = 0.$ Similarly, for all sufficiently small $h$, $$f(x_0 + h) = f(x_0) + f'(x_0) \cdot h + \varphi(h) \|h\|,$$ with $\lim_{h \to 0} \varphi(h) = 0.$
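Plug the expansion of $f$ into that of $g$ (taking $k = k_h$) to reach $$(g \circ f)(x_0 + h) = g(y_0) + g'(y_0) f'(x_0) \cdot h + \eta(h),$$ where $$\eta(h) = g'(y_0) \cdot \varphi(h) \|h\| + \psi(k_h) \|k_h\|$$ and $$k_h = f(x_0 + h) - f(x_0) = f'(x_0) \cdot h + \varphi(h) \|h\|.$$ Observe that $$\|k_h\| \leq \|h\| \big(\|f'(x_0)\| + \|\varphi(h)\| \big) \leq \big(\|f'(x_0)\| + 1\big) \|h\|$$ for all $h$ small enough, since $\|\varphi(h)\| \leq 1$ once $h$ is close to $0$. In particular, $k_h \to 0$ as $h \to 0.$ This entails (for $h \neq 0$) $$\dfrac{\|\eta(h)\|}{\|h\|} \leq \|g'(y_0) \cdot \varphi(h)\| + \big(\|f'(x_0)\| + 1\big) \|\psi(k_h)\|.$$ Since $\varphi(h) \to 0$ and $\psi(k_h) \to 0$ as $h \to 0$ (with the convention $\psi(0) = 0$, so that the case $k_h = 0$ causes no trouble), the right-hand side tends to $0$. Hence $g \circ f$ is differentiable at $x_0$ with derivative $g'(y_0) f'(x_0).$ Q.E.D.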
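For readers who want the substitution step spelled out, here is a sketch of it in the notation of the proof. The only extra ingredient is the linearity of $g'(y_0)$: $$\begin{aligned} (g \circ f)(x_0 + h) &= g(y_0 + k_h) \\ &= g(y_0) + g'(y_0) \cdot k_h + \psi(k_h) \|k_h\| \\ &= g(y_0) + g'(y_0) \cdot \big(f'(x_0) \cdot h + \varphi(h) \|h\|\big) + \psi(k_h) \|k_h\| \\ &= g(y_0) + g'(y_0) f'(x_0) \cdot h + g'(y_0) \cdot \varphi(h) \|h\| + \psi(k_h) \|k_h\|. \end{aligned}$$ The last two terms are exactly $\eta(h)$.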
Remark. This proof works verbatim in any normed space, provided the derivatives are understood as bounded linear maps.
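Example. As a quick sanity check of the formula (the functions below are chosen here purely for illustration and are not part of the exercise), take $f(x,y) = (xy,\, x+y)$ and $g(u,v) = u^2 v$ on $\mathbb{R}^2$. Then $$f'(x,y) = \begin{pmatrix} y & x \\ 1 & 1 \end{pmatrix}, \qquad g'(u,v) = \begin{pmatrix} 2uv & u^2 \end{pmatrix},$$ so that, evaluating $g'$ at $(u,v) = f(x,y)$, $$g'(f(x,y))\, f'(x,y) = \begin{pmatrix} 2xy(x+y) & x^2y^2 \end{pmatrix} \begin{pmatrix} y & x \\ 1 & 1 \end{pmatrix} = \begin{pmatrix} 3x^2y^2 + 2xy^3 & 2x^3y + 3x^2y^2 \end{pmatrix}.$$ This agrees with differentiating $(g \circ f)(x,y) = x^3y^2 + x^2y^3$ directly.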