Given three surfaces $S_1,S_2,S_3$ in $\mathbb{R}^n$ (I am currently studying the differential geometry of curves and surfaces, so I am not familiar with most manifold jargon; I assume the proper terminology would be something like smooth manifolds embedded in $\mathbb{R}^n$) and a chain of smooth maps $$ S_1\xrightarrow{\phi} S_2 \xrightarrow{\psi}S_3 $$ I would like to prove the chain rule for the differentials of $\phi$ and $\psi$. That is, for a point $p \in S_1$, we have $$ d(\psi \circ \phi )_p = d\psi_{\phi(p)} \circ d\phi_p $$ My book defines the differential of a function $\phi: S_1 \to S_2$ at $p$ as a map $d\phi_p:T_pS_1 \to T_{\phi(p)}S_2$ given by $$ d\phi_p = (\phi \circ \gamma )'(0) $$
where $\gamma:(-\epsilon,\epsilon)\to S_1$ is a smooth curve with $\gamma(0) = p$. Using this definition, computing the differential of the composition at $p$ comes down to computing $$ (\psi \circ \phi \circ \gamma)'(0) $$ However, here is where I am a bit unsure. Define $\beta = \phi \circ \gamma$. Then $\beta$ is a smooth curve from $(-\epsilon,\epsilon)$ into $S_2$ with $\beta(0) = \phi(p)$, and we conclude $$ d(\psi \circ \phi )_p = (\psi \circ \phi \circ \gamma)'(0) = (\psi \circ \beta)'(0) = d\psi_{\phi(p)} $$ But this is clearly incorrect. What am I doing wrong?
The correct definition of the differential (push-forward) of $\phi$ is $$d\phi_p : T_p S_1 \to T_{\phi(p)}S_2, \qquad d\phi_p(\gamma'(0)) = (\phi \circ \gamma)'(0) \tag1 $$ where, as you correctly say, $\gamma : (-\varepsilon,+\varepsilon) \to S_1$ is a smooth curve such that $\gamma(0) = p$. The difference from your definition is that we need to say what the differential acts upon, namely the vector $\gamma'(0) \in T_pS_1$. (The right-hand side of $(1)$ depends only on $\gamma'(0)$ and not on the particular curve, which is what makes $d\phi_p$ well defined.) With this correction, your final calculation reads $$d(\psi \circ \phi)_{p}(\gamma'(0)) = (\psi \circ \phi \circ \gamma)'(0) = (\psi \circ \beta)'(0) = d\psi_{\phi(p)}(\beta'(0)).\tag2 $$ This is actually correct! Why? Because $\beta = \phi \circ \gamma$ is a smooth curve in $S_2$ with $\beta(0) = \phi(p)$, so by the definition $(1)$ $$\beta'(0) = (\phi \circ \gamma)'(0) = d\phi_p(\gamma'(0));$$ this means that $(2)$ becomes $$d(\psi \circ \phi)_{p}(\gamma'(0)) = d\psi_{\phi(p)}(\beta'(0)) = d\psi_{\phi(p)}(d\phi_p(\gamma'(0))) = (d\psi_{\phi(p)} \circ d\phi_p)(\gamma'(0)). $$ Since every vector in $T_pS_1$ is of the form $\gamma'(0)$ for some such curve $\gamma$, the two maps agree on all of $T_pS_1$, which gives $d(\psi \circ \phi)_p = d\psi_{\phi(p)} \circ d\phi_p$, exactly as you wanted.
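If it helps to see the argument above with numbers, here is a rough numerical sanity check in Python (NumPy only). Everything in it is an illustrative assumption and not part of the question: $S_1$ is taken to be the unit sphere in $\mathbb{R}^3$, $\phi$ is the linear stretch $x \mapsto Ax$ onto an ellipsoid $S_2$, $\psi$ is radial projection back onto the sphere, and the names `velocity`, `gamma`, and `alpha` are made up for this sketch. Derivatives at $t=0$ are approximated by central differences, so agreement holds only up to discretisation error.

```python
import numpy as np

# Hypothetical example maps (not from the question):
# S1 = unit sphere, S2 = ellipsoid A(S1), S3 = unit sphere.
A = np.diag([2.0, 1.0, 3.0])

def phi(x):
    """phi: S1 -> S2, stretch the sphere onto an ellipsoid."""
    return A @ x

def psi(y):
    """psi: S2 -> S3, project radially back onto the unit sphere."""
    return y / np.linalg.norm(y)

def velocity(curve, h=1e-5):
    """Central-difference approximation of curve'(0)."""
    return (curve(h) - curve(-h)) / (2.0 * h)

# A point p on S1 and a unit tangent vector v at p (p . v = 0).
p = np.array([1.0, 2.0, 2.0]) / 3.0
v = np.array([2.0, -1.0, 0.0]) / np.sqrt(5.0)

def gamma(t):
    """Great-circle curve on S1 with gamma(0) = p, gamma'(0) = v."""
    return np.cos(t) * p + np.sin(t) * v

def alpha(t):
    """A different curve on S2 through phi(p) with the same velocity as phi o gamma."""
    q = p + t * v
    return phi(q / np.linalg.norm(q))

# Left-hand side: d(psi o phi)_p applied to gamma'(0).
lhs = velocity(lambda t: psi(phi(gamma(t))))

# beta'(0) = d(phi)_p(gamma'(0)); for this linear phi it should equal A v.
w = velocity(lambda t: phi(gamma(t)))

# Right-hand side: d(psi)_{phi(p)} applied to w, computed along the
# other curve alpha, whose velocity at 0 is also w.
rhs = velocity(lambda t: psi(alpha(t)))

print("d(psi o phi)_p (v)       ~", lhs)
print("d(psi)_{phi(p)}(d(phi)_p v) ~", rhs)
```

The right-hand side is deliberately evaluated along `alpha` and not along $\beta = \phi \circ \gamma$: the two curves pass through $\phi(p)$ with the same velocity, so they should give the same value of $d\psi_{\phi(p)}$, which is the curve-independence that definition $(1)$ relies on.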