In class we learned the following variant of the Implicit Function Theorem:
Suppose $f:U \to \mathbb{R}^{n-k}$, where $U \subseteq \mathbb{R}^n$, is such that $Df(p)$ has full row rank for all $p \in U$. Then there exist diffeomorphisms $\alpha: U' \to U$ and $\beta: f(U) \to W$ such that $\beta \circ f \circ \alpha = \pi$, the standard orthogonal projection $\mathbb{R}^n \to \mathbb{R}^{n-k}$.
My Question: Why do we need $\beta$? WLOG suppose that the first $n-k$ columns of $Df(p)$ are linearly independent. Then the function $F:(x_1, \dotsc, x_n) \mapsto (f_1, \dotsc, f_{n-k}, x_{n-k+1}, \dotsc, x_n)$ has an inverse in a neighborhood of $p$, so
$$f \circ F^{-1}: (f_1, \dotsc, f_{n-k}, x_{n-k+1}, \dotsc, x_n) \mapsto (f_1, \dotsc, f_{n-k}).$$
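To spell out why $F$ is invertible near $p$ (a sketch, assuming $f$ is $C^1$; the block names $A$ and $B$ are my own notation): let $A$ be the $(n-k)\times(n-k)$ matrix formed by the first $n-k$ columns of $Df(p)$ and $B$ the remaining $(n-k)\times k$ block. Then

$$DF(p) = \begin{pmatrix} A & B \\ 0 & I_k \end{pmatrix}, \qquad \det DF(p) = \det A \neq 0,$$

so the inverse function theorem applies at $p$. Since $\pi \circ F = f$ by construction, this gives $f \circ F^{-1} = \pi$ near $F(p)$, that is, $\alpha = F^{-1}$ already works with no $\beta$.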
So why do we need $\beta$?
You are right: $\beta$ is unnecessary. It probably got into the statement from the more general rank theorem, where it is needed.
Here is another way to see this. The statement says that $f = \beta^{-1}\circ \pi \circ \alpha^{-1}$. Informally, we shuffle the domain, then project, then shuffle the image. But whatever shuffling we want to do after the projection could just as well be done before it, by moving the fibers of the projection map. Formally speaking, $\beta^{-1}\circ \pi $ can be written as $\pi \circ \gamma$ where $\gamma$ is a diffeomorphism of the cylinder domain lying above the domain of $\beta^{-1}$. Just let $\gamma$ keep the $k$ vertical coordinates as they are.
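The construction can be written out explicitly. This is only a sketch: the coordinates $(y,z)$ with $y \in W$, $z \in \mathbb{R}^k$, and the name $\tilde\alpha$, are introduced here for illustration. Define

$$\gamma(y,z) = \bigl(\beta^{-1}(y),\, z\bigr), \qquad \gamma^{-1}(w,z) = \bigl(\beta(w),\, z\bigr).$$

Then $\gamma$ is a diffeomorphism and

$$\pi\bigl(\gamma(y,z)\bigr) = \beta^{-1}(y) = \beta^{-1}\bigl(\pi(y,z)\bigr),$$

so $\beta^{-1}\circ\pi = \pi\circ\gamma$. Consequently $f = \pi \circ \gamma \circ \alpha^{-1}$, and setting $\tilde\alpha = \alpha \circ \gamma^{-1}$ (defined on $\gamma(U')$) gives

$$f \circ \tilde\alpha = \pi \circ \gamma \circ \alpha^{-1} \circ \alpha \circ \gamma^{-1} = \pi,$$

which is the statement with $\beta$ replaced by the identity.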