Say if I have an overdetermined system $A\vec x=\vec b$, I can use the normal equations $\implies$ $A^TA\vec x=A^T\vec b$.
If I solve for $\vec x$ I will get a "solution" with an error. It says in my notes that this is the 'least squares solution' i.e it is the choice of $\vec x $which minimises the error. But this is the only choice of $\vec x$ which solves the normal equations, so how can it be the least squares solution when there is only one solution?
You're confusing two things: the set of solutions to the normal equations, and the set of vectors over which you are searching for the least squared error. The second set is just the set of all vectors $x$ in whatever vector space you're working on. The squared error varies as you choose different vectors from the second set. The goal of the least squares problem is to find the vector(s) that minimize the squared error.
The claim is that if you get a solution to the normal equations, then it also solves the least square problem, in the following sense: if you chose any other vector from the second set besides the solution to the normal equations, you would get a squared error at least as large as the squared error you get using the solution to the normal equations.