I found a derivation of the Normal Equation, without the use of calculus, on Wikipedia (Linear Least Squares). This method rewrites the residual sum of squares:
$$S(\beta)=y^{T}y-2\beta^{T}X^{T}y+\beta^{T}X^{T}X\beta$$
into:
$$S(\beta)=\left \langle \beta, \beta \right \rangle-2\left \langle \beta, (X^{T}X)^{-1}X^{T}y \right \rangle+\left \langle (X^{T}X)^{-1}X^{T}y, (X^{T}X)^{-1}X^{T}y \right \rangle+C,$$
where $\langle \cdot ,\cdot \rangle$ is the inner product defined by $$ \langle x,y\rangle = x^{T}(X^{T}X)y. $$
I understand the idea is to rewrite $S(\beta)$ into the form $S(\beta)=(x-a)^2+b$, so that the minimizer can be read off directly. But I do not understand how $S(\beta)$ is rewritten, or what principles are used to do it.
You missed an important detail: the inner product here is not the standard dot product, but the one defined above, $\langle x,y\rangle = x^{T}(X^{T}X)y$.
With that, you can expand $$ \left \langle \beta, \beta \right \rangle-2\left \langle \beta, (X^{T}X)^{-1}X^{T}y \right \rangle+\left \langle (X^{T}X)^{-1}X^{T}y, (X^{T}X)^{-1}X^{T}y \right \rangle = \\ \beta^{T}X^{T}X\beta -2\beta^{T} X^{T}X (X^{T}X)^{-1}X^{T}y + [(X^{T}X)^{-1}X^{T}y]^{T}X^{T}X (X^{T}X)^{-1}X^{T}y = \\ \beta^{T}X^{T}X\beta - 2\beta^{T}X^{T}y + y^{T}X(X^{T}X)^{-1}X^{T}y, $$ which is exactly $S(\beta)$ minus the constant $C = y^{T}y - y^{T}X(X^{T}X)^{-1}X^{T}y$. That constant is why the $+C$ appears in the rewritten form.
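If you want to sanity-check the expansion numerically, here is a quick sketch with NumPy (the random $X$, $y$, $\beta$ are just placeholders):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((10, 3))     # arbitrary design matrix
y = rng.standard_normal(10)
beta = rng.standard_normal(3)        # arbitrary candidate coefficients

XtX = X.T @ X
beta_hat = np.linalg.solve(XtX, X.T @ y)   # (X^T X)^{-1} X^T y

def ip(a, b):
    """Inner product <a, b> = a^T (X^T X) b from the question."""
    return a @ XtX @ b

S = y @ y - 2 * beta @ X.T @ y + beta @ XtX @ beta   # original RSS form
C = y @ y - ip(beta_hat, beta_hat)                   # the constant C
S_rewritten = ip(beta, beta) - 2 * ip(beta, beta_hat) + ip(beta_hat, beta_hat) + C
assert np.isclose(S, S_rewritten)
```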
(old answer below)
Make the substitution $\gamma = X\beta$. Since $X^{T}X$ is positive definite, $X$ has full column rank, so there exists a matrix $M$ with $MX = I$, and hence $\beta = MX\beta = M\gamma$. In particular, take $M = (X^{T}X)^{-1}X^{T}$.
We then have $$ S(\gamma) = y^{T}y-2\beta^{T} X^{T}y+\beta^{T}X^{T}X\beta = \\ y^{T}y - 2(M\gamma)^{T} X^{T}y + \gamma^{T}\gamma = \\ \gamma^{T}\gamma - 2\gamma^{T} M^{T}X^{T}y + y^{T}y = \\ \langle \gamma, \gamma \rangle - 2\langle \gamma,M^{T}X^{T} y \rangle + \langle y,y \rangle =\\ \langle \gamma, \gamma \rangle - 2\langle \gamma,X(X^{T}X)^{-1}X^{T} y \rangle + \langle y,y \rangle, $$ where $\langle \cdot,\cdot \rangle$ is now the standard inner product. Writing $p = X(X^{T}X)^{-1}X^{T}y$ and completing the square, $$ S(\gamma) = \langle \gamma - p, \gamma - p \rangle + \langle y,y \rangle - \langle p,p \rangle, $$ which is minimized exactly when $\gamma = p$, i.e. $X\beta = X(X^{T}X)^{-1}X^{T}y$; multiplying on the left by $M$ gives $\beta = (X^{T}X)^{-1}X^{T}y$.
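This derivation, too, can be sanity-checked numerically. A minimal sketch with placeholder random data, verifying the substitution identity and that the minimizing $\beta$ is the least-squares solution:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal((8, 2))
y = rng.standard_normal(8)
beta = rng.standard_normal(2)

M = np.linalg.solve(X.T @ X, X.T)    # M = (X^T X)^{-1} X^T, a left inverse of X
assert np.allclose(M @ X, np.eye(2))

gamma = X @ beta
assert np.allclose(M @ gamma, beta)  # beta is recovered from gamma

# S in terms of beta, and the same quantity in terms of gamma
S = y @ y - 2 * beta @ X.T @ y + beta @ X.T @ X @ beta
p = X @ (M @ y)                      # p = X (X^T X)^{-1} X^T y
S_gamma = gamma @ gamma - 2 * gamma @ p + y @ y
assert np.isclose(S, S_gamma)

# The quadratic in gamma is minimized at gamma = p, i.e. beta = M y,
# which matches the least-squares solution.
beta_star = M @ y
assert np.allclose(beta_star, np.linalg.lstsq(X, y, rcond=None)[0])
```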