The following is an excerpt from Differential Topology by Guillemin and Pollack
But $f$ is certainly not the identity as stated above, so what exactly do G&P mean when they say "then one can choose local co-ordinates around $x$ and $y$ so that $f$ appears to be the identity"?
From what I gather from the commutative diagram, this 'reformulation' of the Inverse Function Theorem states that for $x \in X$, if $df_x$ is an isomorphism, then for parameterizations $\phi : U \to X$ and $\psi : U \to Y$ we have $f = \psi \circ \phi^{-1}$.
But even if we have $f = \psi \circ \phi^{-1}$, $f$ need not be the identity function. So what exactly were G&P trying to say here?

The Inverse Function Theorem states that if $f:X\to Y$ is a smooth map and $df_x$ is an isomorphism, then $f$ is a local diffeomorphism at $x.$ Hence, for some open set $W$ in $X$ containing $x,$ the restriction $f|_W$ is a diffeomorphism onto its image $f(W).$ If necessary, we can shrink $W$ so that there exist local parametrizations $\phi:U \to W$ and $\psi:U \to f(W),$ defined on the same open set $U,$ which can be chosen so that $f=\psi \circ \text{id}_U \circ \phi^{-1}$ on $W.$
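One way to justify the identity $f=\psi \circ \text{id}_U \circ \phi^{-1}$ (a sketch of a possible choice, not necessarily the construction G&P have in mind): take any parametrization $\phi:U \to W$ and define $\psi$ in terms of it by $\psi := f \circ \phi.$ Since $f|_W$ is a diffeomorphism onto $f(W),$ this $\psi$ is a diffeomorphism from $U$ onto $f(W),$ and $$\psi^{-1} \circ f \circ \phi = (f \circ \phi)^{-1} \circ (f \circ \phi) = \text{id}_U.$$ So the identity map does not come for free from arbitrary $\phi$ and $\psi$; it holds because the two parametrizations are matched to each other through $f.$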
Now, the inverses of these maps give coordinates: $\phi^{-1}=(x_1,\ldots,x_k)$ on $W$ and $\psi^{-1}=(y_1,\ldots,y_k)$ on $f(W).$ Let $w \in W.$ Then $w=\phi(u)$ for some $u \in U.$ In coordinates we have $$u=(x_1(w),\ldots,x_k(w))=(y_1(w'),\ldots,y_k(w'))$$ where $w'=\psi(u) \in f(W).$ When G&P say that $f$ "appears to be the identity", they are referring to the fact that $$f(w)=f(\phi(u))=\psi(u)=w',$$ so $w$ and $w'$ have the same "representative" element $u \in U.$ That is why, by abuse of notation, one usually writes the coordinates $(y_1,\ldots,y_k)$ on $f(W)$ as the same coordinates $(x_1,\ldots,x_k)$ on $W,$ and writes $f(x_1,\ldots,x_k)=(x_1,\ldots,x_k),$ when strictly it should be $f(x_1,\ldots,x_k)=(y_1,\ldots,y_k).$
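A concrete example may help; the specific map here is my own illustration, not taken from G&P. Let $X=Y=\mathbb{R}$ and $f(t)=e^t,$ which is certainly not the identity, although $df_t$ is multiplication by $e^t \neq 0$ and hence an isomorphism at every point. Take $U=W=\mathbb{R},$ $\phi=\text{id}_{\mathbb{R}},$ and $\psi(u)=e^u,$ so that $f(W)=(0,\infty).$ The coordinate on $W$ is $x(w)=w$ and the coordinate on $f(W)$ is $y(w')=\log w'.$ Then for every $w \in W,$ $$y(f(w))=\log(e^w)=w=x(w).$$ The point $w$ and its image $e^w$ are different real numbers, but they carry the same coordinate value, so in these coordinates $f$ reads as $x \mapsto x.$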