In machine learning we use a cost function such as least-squared error to evaluate how good a model is, and if one model has a better score than another (assuming it does not overfit), we choose that model.
But in statistics, R-squared seems to be favored for model selection rather than least squares.
What's the point of R-squared/adjusted R-squared when we already have least squares to measure performance?
What am I missing, or am I just confused?
$R^2$ is generally reported for linear regression, while $MSE$ applies to arbitrary functions in high dimensions, the general domain of neural nets, for instance. $R^2$ normalizes the residual error by the variance of the response, so it reads as the fraction of variance explained, which makes it comparable across datasets in a way that raw $MSE$ is not.
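One point worth noting: on a fixed dataset, $R^2 = 1 - SS_{res}/SS_{tot}$ is a monotone decreasing transformation of $MSE$, since $SS_{tot}$ is constant. So both metrics rank models identically; they differ in interpretability, not in which model they select. A small sketch (the data and model predictions here are made up for illustration):

```python
import numpy as np

# Hypothetical data: true responses and two competing models' predictions.
y = np.array([3.0, 5.0, 7.0, 9.0])
pred_a = np.array([2.8, 5.1, 7.3, 8.7])  # model A: closer fit
pred_b = np.array([3.5, 4.0, 8.0, 9.5])  # model B: looser fit

def mse(y, yhat):
    """Mean squared error: SS_res / n."""
    return np.mean((y - yhat) ** 2)

def r_squared(y, yhat):
    """R^2 = 1 - SS_res / SS_tot."""
    ss_res = np.sum((y - yhat) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

# Since SS_tot is fixed for a given dataset, R^2 = 1 - MSE / Var(y):
# lower MSE always means higher R^2, so both metrics agree on the ranking.
print(mse(y, pred_a), r_squared(y, pred_a))
print(mse(y, pred_b), r_squared(y, pred_b))
```

The practical difference is that $R^2$ is unitless and bounded above by 1, so "0.95" means the same thing across problems, while an $MSE$ of 0.05 is meaningless without knowing the scale of $y$.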