I am confused about function composition in the proof below by John Lee. We define $\Phi: U \to N$ to be a local defining map for $S$, i.e. $U$ is an open subset of $M$ and $S \cap U$ is a regular level set of $\Phi$. In this case, since $S$ could contain points outside of $U$, $\Phi \circ \iota$ makes sense only on $S \cap U$. But how can we split the differential of $\Phi \circ \iota$ into $d\Phi_p \circ d\iota_p$? I am not sure how this works, because now we are looking at linear maps from $T_p S$ to $d\iota_p(T_p S)$ and from $T_p U$ to $T_{\Phi(p)} N$, and each space takes functions on its own neighborhoods, i.e. a smooth function on $S \cap U$ may no longer be smooth on $U$, so the tangent vectors in each space act on smooth functions defined on different domains.
In fact, it is not clear to me why $\Phi \circ \iota$ is smooth on $S\cap U$ to begin with.
I would greatly appreciate any help.

The inclusion of the embedded submanifold $\iota : S \to M$ is a smooth map. The composition $\Psi = \Phi \circ \iota$ is defined on the set $\iota^{-1}(U) = S \cap U$. Let us be more precise concerning $\Psi$. Since $S \cap U$ is open in $S$, the restriction $\iota|_{S \cap U} : S \cap U \to M$ is smooth (see Proposition 2.6 (b)). Hence the map $$\iota_U : S \cap U \to U, \quad \iota_U(x) = \iota(x) $$ is smooth because $U$ is open in $M$.$^1$ Note that $d(\iota_U)_q = d\iota_q$ for all $q \in S \cap U$.$^2$
So more precisely $\Psi$ stands for
$$\Psi = \Phi \circ \iota_U : U \cap S \xrightarrow{\iota_U} U \xrightarrow{\Phi} N .$$ Hence $\Psi$ is a smooth map by Proposition 2.10 (d).
Now Proposition 3.6 (b), which is a variant of the chain rule for smooth maps between smooth manifolds, applies to show $$d\Psi_p = d\Phi_{\iota_U(p)} \circ d(\iota_U)_p = d\Phi_p \circ d\iota_p .$$
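As a sanity check, here is a small worked example of this factorization. The choice of $M = \mathbb{R}^2$, $S = S^1$, $U = \mathbb{R}^2 \setminus \{0\}$, $N = \mathbb{R}$ and $\Phi(x,y) = x^2 + y^2$ is my own illustration and is not taken from Lee's text. Here $S = \Phi^{-1}(1)$ is a regular level set, because $d\Phi = 2x\,dx + 2y\,dy$ vanishes nowhere on $U$. Take $p = (a,b) \in S^1$ and the curve $\gamma(t) = (a\cos t - b\sin t,\; b\cos t + a\sin t)$ in $S^1$ with $\gamma(0) = p$, and let $v = \gamma'(0) \in T_pS$. Then
$$d\iota_p(v) = -b\,\frac{\partial}{\partial x}\bigg|_p + a\,\frac{\partial}{\partial y}\bigg|_p, \qquad d\Phi_p\big(d\iota_p(v)\big) = 2a\cdot(-b) + 2b\cdot a = 0 .$$
On the other hand $\Psi = \Phi \circ \iota_U$ is the constant map $S^1 \to \mathbb{R}$ with value $1$, so $d\Psi_p(v) = 0$ directly. Both sides of $d\Psi_p = d\Phi_p \circ d\iota_p$ agree, and in this example one sees concretely that $d\iota_p(T_pS) \subset \ker d\Phi_p$.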
Footnotes
1. This is a consequence of the following lemma.
Lemma. Let $f : M \to N$ be a map between smooth manifolds such that $f(M)$ is contained in an open subset $U \subset N$. Then $f$ is smooth if and only if the "codomain restriction" of $f$ to $U$, i.e. the map $\tilde f : M \to U, \tilde f(x) = f(x)$, is smooth.
If $\tilde f$ is smooth, then $f = j \circ \tilde f$ is smooth by Proposition 2.10 (c), (d), where $j : U \to N$ denotes the inclusion. The converse can easily be proved by using the definition of "smooth".
2. There is a slight abuse of notation here when writing $d(\iota_U)_q = d\iota_q$. By Proposition 3.9 (The Tangent Space to an Open Submanifold) we can identify the tangent space to an open submanifold with the tangent space to the whole manifold. That is, we write $T_q(S \cap U) = T_qS$ and $T_qU = T_qM$ although these tangent spaces are not really equal, but only naturally isomorphic. But certainly the diagram $\require{AMScd}$ \begin{CD} T_q(S \cap U) @>{\approx}>> T_qS \\ @V{d(\iota_U)_q}VV @VV{d\iota_q}V \\ T_qU @>>{\approx}> T_qM \end{CD}
commutes since the horizontal arrows are induced by the inclusion maps $S \cap U \to S$ and $U \to M$.
By the way, the statement $T_pS = \ker d \Phi_p$ in Proposition 5.38 only makes sense if we identify $T_pS$ with a linear subspace of $T_pM$, which is again a slight abuse of notation.