I'm having trouble understanding the following proof of this proposition:
If a Sylow $p$-subgroup of a finite group $G$ is normal in $G$, then it is the largest $p$-subgroup of $G$ and the only Sylow $p$-subgroup of $G$.
First, some definitions:
If $p$ is a prime number, a $\mathbf{p}$-group is a group whose order is $p^k$, with $k\in\mathbb{N}$. A Sylow $\mathbf{p}$-subgroup of a finite group $G$ is a subgroup of order $p^k$ such that $p^k$ divides $|G|$ and $p^{k+1}$ does not divide $|G|$.
Going back to the statement, the proof goes like this:
Let the Sylow $p$-subgroup $S$ be normal in $G$. If $T$ is a $p$-subgroup of $G$, then $ST\leq G$ and $|ST|=|S||T|/|S\cap T|\geq |S|$, by the Third Isomorphism Theorem. (Up to this point everything is fine, since both $|S|$ and $|T|$ are powers of a prime, and $S\cap T$ is a subgroup of $T$.) Hence $|ST|=|S|$, by the choice of $S$, so that $T\subset ST=S$ (this last step is the one I don't understand).
I know $ST=\langle S\cup T\rangle$ (I think Hungerford's Algebra refers to this as the join). Maybe that's useful for understanding the above, but I'm not sure.
In my opinion, the statement of the proposition and its proof are a bit messy. There is no need to say "largest $p$-subgroup"; it simply means a Sylow $p$-subgroup of $G$. In the proof, I don't clearly see why $|ST|=|S|$ holds.
A more general definition of a $p$-group is the following: $G$ is said to be a $p$-group if for every $g\in G$, $|g|=p^i$ for some $i\geq 0$. When $G$ is finite, this definition is equivalent to $|G|$ being a power of $p$. For a non-trivial Sylow $p$-subgroup to exist, you only need $p$ to divide $|G|$. Write $|G|=p^km$ with $(p,m)=1$; then every Sylow $p$-subgroup of $G$ has order $p^k$.
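As a small illustration of the factorization $|G|=p^km$ (the order $12$ is just a number I picked for the example, not something from the question):

$$|G| = 12 = 2^2\cdot 3:\qquad p=2,\ k=2,\ m=3 \ \Longrightarrow\ |P|=2^2=4; \qquad p=3,\ k=1,\ m=4 \ \Longrightarrow\ |P|=3^1=3.$$

So in a group of order $12$, every Sylow $2$-subgroup has order $4$ and every Sylow $3$-subgroup has order $3$.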
This proposition is usually proved using the second Sylow theorem. Suppose $T$ is a Sylow $p$-subgroup of $G$. By the second Sylow theorem, there exists $g\in G$ such that $T=gSg^{-1}$. Since $S$ is normal in $G$, we have $gSg^{-1}=S$. Thus $T=S$.
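For completeness, here is one way the step $|ST|=|S|$ and the conclusion $T\subset S$ in the quoted proof can be filled in. This is only a sketch; the exponents $k$, $j$, $i$ and the cofactor $m$ are names I am introducing, with $|G|=p^km$, $(p,m)=1$, $|S|=p^k$, $|T|=p^j$ and $|S\cap T|=p^i$ (so $i\leq j$, because $S\cap T\leq T$).

$$
\begin{aligned}
|ST| &= \frac{|S|\,|T|}{|S\cap T|} = p^{\,k+j-i} && \text{(a power of } p\text{, so } ST \text{ is a } p\text{-subgroup)}\\
p^{\,k+j-i} &\ \big|\ p^k m && \text{(Lagrange, since } ST\leq G\text{)}\\
k+j-i &\leq k && \text{(since } (p,m)=1\text{)}\\
j &= i && \text{(combined with } i\leq j\text{)}\\
|S\cap T| &= |T| \ \Longrightarrow\ S\cap T = T \ \Longrightarrow\ T\subset S.
\end{aligned}
$$

In particular $|ST|=p^k=|S|$, and since $S\subset ST$ this gives $ST=S$, which is the line $T\subset ST=S$ in the proof. Taking $T$ to be another Sylow $p$-subgroup gives $T\subset S$ with $|T|=|S|$, hence $T=S$, without using the second Sylow theorem.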