So in PCA I encountered a formulation for LSE, which is:
$$\frac{1}{2} \sum_{i=1}^N \|x_i - \tilde{x}_i\|^2$$
where $\tilde{x}_i$ is a "restriction" of $x_i$ in which only part of each observation is retained. So the LSE measures the discrepancy between the "full model" and a reduced model.
When I looked up LSE elsewhere, the $\frac{1}{2}$ did not appear in the formulations.
This is a common simplification done in machine learning texts.
The answer to your question is that it doesn't matter whether the $\dfrac{1}{2}$ is there or not. Why? Because the whole goal is to minimize the residual sum of squares. Call the residual sum of squares without the $1/2$ $\text{RSS}$. Then, from calculus, minimizing $\text{RSS}$ is the same as minimizing $(1/2) \cdot \text{RSS}$: multiplying an objective by a positive constant scales its values but does not change where the minimum occurs.
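You can see this numerically. Below is a minimal sketch (the data and the one-parameter model $y_i \approx b x_i$ are made up for illustration): the slope minimizing $\text{RSS}$ and the slope minimizing $(1/2)\,\text{RSS}$ are identical, and both match the closed-form least-squares solution.

```python
import numpy as np

# Toy data: fit y ≈ b * x by least squares (one free parameter, for illustration).
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([0.1, 0.9, 2.2, 2.8, 4.1])

# Evaluate RSS(b) on a fine grid of candidate slopes via broadcasting.
bs = np.linspace(0.0, 2.0, 20001)
rss = ((y[None, :] - bs[:, None] * x[None, :]) ** 2).sum(axis=1)

# Minimize RSS and (1/2) * RSS: the argmin is the same slope.
b_rss = bs[np.argmin(rss)]
b_half = bs[np.argmin(0.5 * rss)]

# Both agree (up to grid resolution) with the closed-form
# least-squares slope sum(x*y) / sum(x^2).
b_closed = (x @ y) / (x @ x)
print(b_rss, b_half, b_closed)
```

The grid search is deliberately crude; any optimizer would do, since scaling the objective by $1/2$ leaves every comparison between candidate values, and hence the argmin, unchanged.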
The $(1/2)$ is usually there to simplify the derivative that comes out of the power rule. As a simple example, suppose I wanted to find the values of $x$ which minimize $f(x) = x^2$. I could, equivalently, find the $x$ values minimizing $g(x) = \dfrac{1}{2}x^2$. Taking the derivative yields a simpler form for $g^{\prime}(x) = x$ than for $f^{\prime}(x) = 2x$, as the $\dfrac{1}{2}$ "cancels out" with the $2$. Yet in both cases, applying the usual critical-point and second-derivative tests gives the same answer, $x = 0$.
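The same cancellation is what motivates the $\frac{1}{2}$ in the least-squares objective itself. For a one-parameter fit $y_i \approx b x_i$ (a simplified stand-in for the PCA objective above), the power rule and chain rule give

$$\frac{d}{db}\left[\frac{1}{2}\sum_{i=1}^N (y_i - b x_i)^2\right] = -\sum_{i=1}^N x_i\,(y_i - b x_i),$$

with no stray factor of $2$; without the $\frac{1}{2}$, every term on the right would carry one. Setting either derivative to zero yields the same minimizing $b$.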