Early on in Guillemin and Pollack, after introducing the Inverse Function Theorem, the authors state: "If $df_x$ is an isomorphism, then one can choose local coordinates around $x$ and $y$ so that $f$ appears to be the identity, $f(x_1,...,x_k)=(x_1,...,x_k)$."
They state this without any reasoning, so perhaps it should be obvious and I'm missing something critical. I suppose it makes sense why it would be true. If $df_x$ is an isomorphism of the tangent spaces $T_xX$ and $T_yY$, then by the Inverse Function Theorem, $f$ is a local diffeomorphism, and so the two spaces should look identical locally. But I would like to know why precisely this is the case. Why can we "choose local coordinates" so that this works? How do we choose these coordinates? I know that we'd have to start with parametrizations, say $\phi:U\rightarrow X$ and $\psi:U\rightarrow Y$, where $U\subset \mathbb{R}^k$, but I'm not sure where to go from there. I'm a little confused about what exactly "local coordinates" even are.
If $\phi, \psi$ are charts around $x,y$ respectively, then $\Psi:=\psi \circ f \circ \phi^{-1}$ is a diffeomorphism between open sets of $\mathbb{R}^n$ (leaving aside restrictions on domain and image). It is therefore a change of coordinates. You can then take $\Psi^{-1} \circ \psi$ as local coordinates for $y$, and then you have that the expression of $f$ in the local coordinates $(\phi,\Psi^{-1} \circ \psi)$ is $$\widetilde{f}=(\Psi^{-1} \circ \psi) \circ f \circ \phi^{-1}=\Psi^{-1} \circ (\psi \circ f \circ \phi^{-1})=\mathrm{Id}.$$