Can we apply the chain rule to the composition of $Df$ with $g$, i.e. to $Df\circ g$?
What would $D(Df\circ g)(y)$ look like?
Any help would be appreciated.
Edit: I'm trying to compute $D^2(f\circ g)(y)(v,w)$, and the book states that, with $x = g(y)$, I'm supposed to get $D^2f(x)(Dg(y)v,Dg(y)w)+Df(x)D^2g(y)(v,w)$. I can derive this formula using the product rule for multilinear functions (composition is bilinear and continuous). However, the book states that we can obtain the same formula by simply differentiating the expression we get after using the chain rule.
Yes. If $g$ is differentiable at $x$ and $f$ is twice differentiable at $g(x)$, the chain rule applies to $(Df)\circ g$ and we have:
$$D((Df)\circ g)(x) = [D(Df)(g(x))] \circ [Dg(x)] = [D^2 f(g(x))] \circ [Dg(x)]$$
Strictly speaking, for $f:\Bbb R^m \to \Bbb R^n$, $D^2f(g(x))$ is not a bilinear function. Rather, it's a linear function whose values are linear functions: it is an element of $\mathcal{L}(\Bbb R^m, \mathcal{L}(\Bbb R^m, \Bbb R^n))$. However, it may be looked at as a bilinear function through the identification of $\mathcal{L}(\Bbb R^m, \mathcal{L}(\Bbb R^m, \Bbb R^n))$ with $\mathcal{L}(\Bbb R^m \times \Bbb R^m; \Bbb R^n)$ (where the latter denotes the set of bilinear maps $\Bbb R^m \times \Bbb R^m \to \Bbb R^n$), which sends $T$ to the bilinear map $(u,v) \mapsto (Tu)v$.
So for instance, we would have:
$$[D((Df)\circ g)(x)](\xi,\eta) \overbrace{=}^{\text{by identification}} [[D((Df)\circ g)(x)](\xi)]\eta = [[D^2f(g(x))](Dg(x)\xi)]\eta$$
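To connect this with the formula in the question, here is a sketch of the remaining step, assuming $f$ and $g$ are both twice differentiable and writing $v,w$ for vectors in the domain of $g$ (the names $v,w$ come from the question; the intermediate step is my own filling-in, not quoted from the book). Start from the chain rule $D(f\circ g)(x) = Df(g(x))\circ Dg(x)$ and differentiate the right-hand side in $x$ with the product rule for composition, using the expression above for $D((Df)\circ g)(x)$:

$$\begin{aligned}
D^2(f\circ g)(x)(v,w) &= \big[[D((Df)\circ g)(x)](v)\big](Dg(x)w) + Df(g(x))\,\big[[D^2g(x)](v)\big]w \\
&= D^2f(g(x))(Dg(x)v,\,Dg(x)w) + Df(g(x))\,D^2g(x)(v,w)
\end{aligned}$$

As a sanity check, in one variable this reduces to the familiar $(f\circ g)''(x) = f''(g(x))\,g'(x)^2 + f'(g(x))\,g''(x)$.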