I am reading a paper on real estate price prediction and it says the following...
In OLS (Ordinary Least Squares) we estimate the model $y = X\beta + u$, where $y$ is the dependent variable, $X$ is a matrix of the observations on the independent variables, and $u$ is the error term. ...
The regression coefficients $\beta$ are estimated by minimizing $(y - X\beta)'(y - X\beta)$.
The first part makes sense, but the second part doesn't. It looks like something close to the normal equations, but I can't reproduce it.
Why do the coefficients come from minimizing that term?
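
To make the quoted claim concrete, here is a minimal numerical sketch in Python with NumPy. The data is made up and every name in it (`sum_squared_residuals`, `true_beta`, and so on) is illustrative, not taken from the paper. It assumes $X$ has full column rank, and checks that the $\beta$ solving the normal equations $X'X\beta = X'y$ also gives a smaller value of $(y - X\beta)'(y - X\beta)$ than nearby alternatives.

```python
import numpy as np


def sum_squared_residuals(X, y, beta):
    """Return (y - X beta)'(y - X beta), the quantity OLS minimizes."""
    residuals = y - X @ beta
    return float(residuals @ residuals)


# Synthetic data (an assumption for illustration only).
rng = np.random.default_rng(0)
n, k = 50, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])
true_beta = np.array([1.0, 2.0, -0.5])
y = X @ true_beta + rng.normal(scale=0.1, size=n)

# Solve the normal equations X'X beta = X'y directly.
beta_normal = np.linalg.solve(X.T @ X, X.T @ y)

# Solve the least-squares problem with NumPy's dedicated routine.
beta_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)

# Compare the objective at the solution with its value at nearby points.
best = sum_squared_residuals(X, y, beta_normal)
perturbed = [
    sum_squared_residuals(X, y, beta_normal + rng.normal(scale=0.05, size=k))
    for _ in range(5)
]

print("normal equations:", beta_normal)
print("lstsq:           ", beta_lstsq)
print("objective at solution:", best)
print("objective at perturbed points:", perturbed)
```

Both routes should give the same coefficients up to floating-point error, and every perturbed $\beta$ should give a larger sum of squared residuals. This is consistent with the normal equations being the first-order condition of the minimization, but it is a numerical check rather than a derivation.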