Normalization in Linear Regression


In linear regression problems it is important that the fitted curve does not overfit the training examples: it should generalise the training data so that new values can be predicted. It is sometimes helpful to normalise the input data so that all features of each training example lie in the same range of values (e.g. [-1, 1]). Doing so lets an iterative cost-minimisation algorithm such as gradient descent converge to the minimum in fewer iterations. I can easily visualise this effect when running gradient descent, which brings me to my question: why is it not necessary to normalise the input data when using an analytical (non-iterative) approach such as the normal equation? When testing new examples with the fitted hypothesis, I obtain the same results with normalised gradient descent and with the unnormalised normal equation. Why?
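The observation can be reproduced with a small NumPy sketch (synthetic data and parameter choices are illustrative, not from the question): fitting by the normal equation on the raw feature and by gradient descent on the standardised feature yields the same predictions, because standardisation is an invertible affine change of coordinates.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: one feature on a large scale, plus noise.
x = rng.uniform(0, 1000, size=50)
y = 3.0 * x + 7.0 + rng.normal(0, 5, size=50)

# --- Normal equation on the raw (unnormalised) feature ---
X_raw = np.column_stack([np.ones_like(x), x])
theta_ne = np.linalg.solve(X_raw.T @ X_raw, X_raw.T @ y)

# --- Gradient descent on the standardised feature ---
mu, sigma = x.mean(), x.std()
x_n = (x - mu) / sigma
X_n = np.column_stack([np.ones_like(x_n), x_n])
theta = np.zeros(2)
alpha = 0.1
for _ in range(5000):
    grad = X_n.T @ (X_n @ theta - y) / len(y)
    theta -= alpha * grad

# Predictions on new inputs agree, even though the parameter
# vectors live in different (raw vs standardised) coordinates.
x_new = np.array([100.0, 500.0, 900.0])
pred_ne = theta_ne[0] + theta_ne[1] * x_new
pred_gd = theta[0] + theta[1] * (x_new - mu) / sigma
print(np.allclose(pred_ne, pred_gd, atol=1e-3))
```

Note that the learned coefficients themselves differ between the two fits; only after undoing the standardisation (as in `pred_gd`) do the hypotheses coincide.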



BEST ANSWER

The normal equation gives the exact least-squares solution that gradient descent only approximates, and normalisation is an invertible rescaling of the features that leaves the fitted predictions unchanged. This is why you obtain the same results either way.

However, when the features are strongly correlated, that is, when the matrix $X'X$ is ill-conditioned, you may run into numerical problems when inverting it, and these can be made less severe by normalising the features first.
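This conditioning effect can be seen directly with NumPy (a sketch on hypothetical data, not part of the original answer): two nearly collinear features on very different scales give $X'X$ a huge condition number, which standardising the columns reduces by several orders of magnitude.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200

# Two correlated features on very different scales:
# x2 is nearly a large multiple of x1, plus small noise.
x1 = rng.normal(0, 1, size=n)
x2 = 1000.0 * x1 + rng.normal(0, 0.5, size=n)
X = np.column_stack([np.ones(n), x1, x2])

# Condition number of X'X on the raw features...
cond_raw = np.linalg.cond(X.T @ X)

# ...and after standardising each feature to zero mean, unit variance.
Z = X.copy()
Z[:, 1:] = (X[:, 1:] - X[:, 1:].mean(axis=0)) / X[:, 1:].std(axis=0)
cond_norm = np.linalg.cond(Z.T @ Z)

print(cond_norm < cond_raw)  # standardising shrinks the condition number
```

Even with standardisation, the residual collinearity keeps the problem somewhat ill-conditioned, so in practice a solver such as `np.linalg.lstsq` (which avoids forming and inverting $X'X$ explicitly) is preferable to a direct inversion.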