I'm trying to understand the proof of the following theorem:
Theorem 2.20 (Product Formula). If $S$ and $T$ are subgroups of a finite group $G$, then $$|ST|\, |S \cap T| = |S|\,|T|.$$
Remark. The subset $ST$ need not be a subgroup.
Proof. Define a function $\varphi: S \times T \to ST$ by $(s, t) \mapsto st$. Since $\varphi$ is a surjection, it suffices to show that if $x \in ST$, then $|\varphi^{-1}(x)| = |S \cap T|$. We show that $\varphi^{-1}(x) = \{(sd, d^{-1}t): d \in S \cap T\}$. It is clear that $\varphi^{-1}(x)$ contains the right side. For the reverse inclusion, let $(s, t), (\sigma, \tau) \in \varphi^{-1}(x)$; that is, $s, \sigma \in S$, $t, \tau \in T$, and $st = x = \sigma \tau$. Thus, $s^{-1}\sigma = t\tau^{-1} \in S \cap T$; let $d = s^{-1}\sigma = t\tau^{-1}$ denote their common value. Then $\sigma = s(s^{-1}\sigma) = sd$ and $d^{-1}t = \tau t^{-1}t = \tau$, as desired. $\quad \square$
I think they're using that \begin{equation}\tag{1} S \times T = \operatorname{Dom} \varphi = \bigcup_{x \in \operatorname{Im}\varphi} \varphi^{-1}(x) = \bigcup_{x \in ST} \varphi^{-1}(x), \end{equation} and so $$|S||T|=|S \times T|=\sum_{x \in ST} |\varphi^{-1}(x)| = \sum_{x \in ST} |S \cap T| = |ST||S\cap T|.$$ But if that's what they're using, I don't understand why they say "Since $\varphi$ is a surjection, it suffices...": proving $(1)$ doesn't require $\varphi$ to be a surjection, since $(1)$ holds for every function. So why do they say that?
It is true that $$\bigcup_{x \in \operatorname{Im}\varphi}\varphi^{-1}(x) = \bigcup_{x \in ST}\varphi^{-1}(x)$$ holds even if $\varphi$ is not surjective. Without surjectivity, however, the fibers over points outside the image are empty, so the counting argument only gives
\begin{align*} |S\times T| &= \sum_{x \in ST}|\varphi^{-1}(x)|\\ &= \sum_{x \in \operatorname{Im}\varphi}|\varphi^{-1}(x)| + \sum_{x \in ST \setminus \operatorname{Im}\varphi}|\varphi^{-1}(x)|\\ &= \sum_{x\in \operatorname{Im}\varphi}|S\cap T| + \sum_{x \in ST \setminus \operatorname{Im}\varphi}0\\ &= |\operatorname{Im}\varphi||S\cap T|. \end{align*}
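
If it helps to see the fiber counting concretely, here is a small sketch that checks it by brute force in $S_3$. Everything in it is my own illustrative choice, not something from the book: permutations are stored as tuples, and the names `compose`, `fibers`, and `check_product_formula` are hypothetical helpers. The keys of the dictionary returned by `fibers` are exactly $\operatorname{Im}\varphi$, which equals $ST$ by the definition of $ST$.

```python
from collections import defaultdict
from itertools import product


def compose(p, q):
    """Return the permutation p∘q (apply q first, then p); permutations are tuples."""
    return tuple(p[q[i]] for i in range(len(q)))


def fibers(S, T):
    """Group the pairs (s, t) in S x T by their product st.

    The keys are the image of phi(s, t) = st, i.e. the set ST.
    """
    out = defaultdict(list)
    for s, t in product(S, T):
        out[compose(s, t)].append((s, t))
    return dict(out)


def check_product_formula(S, T):
    """Check that every fiber has |S ∩ T| elements and |ST||S ∩ T| = |S||T|."""
    fib = fibers(S, T)
    common = set(S) & set(T)
    same_size = all(len(pairs) == len(common) for pairs in fib.values())
    return same_size and len(fib) * len(common) == len(S) * len(T)


e = (0, 1, 2)
S = [e, (1, 0, 2)]            # subgroup generated by the transposition swapping 0 and 1
T = [e, (2, 1, 0)]            # subgroup generated by the transposition swapping 0 and 2
A3 = [e, (1, 2, 0), (2, 0, 1)]  # the cyclic subgroup of order 3

fib = fibers(S, T)
ST = set(fib)

# ST is closed under composition only if it is a subgroup; here it is not.
ST_is_closed = all(compose(a, b) in ST for a in ST for b in ST)

print(len(ST), check_product_formula(S, T), ST_is_closed)
print(len(fibers(A3, A3)), check_product_formula(A3, A3))
```

In the first example $S \cap T$ is trivial, so each of the four elements of $ST$ has exactly one preimage, and $ST$ fails to be a subgroup, as in the Remark. In the second, $S = T$, so $ST$ has three elements and each fiber has three pairs.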