Ridge Regression Centering Proof


This is a ridge regression problem. The following two problems are equivalent:

$(w_\lambda, b_\lambda) = \operatorname*{argmin}_{w,b}\Bigl\{\sum_{i=1}^m (y_i - b - w^T x_i)^2 + \lambda w^T w\Bigr\}$

$(w_\lambda, b_\lambda) = \operatorname*{argmin}_{w,b}\Bigl\{\sum_{i=1}^m (y_i - b - w^T (x_i - \bar x))^2 + \lambda w^T w\Bigr\}$

where:

  • $\bar x$ is the average of the input data.
  • $\lambda$ defines a trade-off between the error on the data and the norm of the vector $w$ (the degree of regularization).
  • (I'm assuming $b$ is a bias term)

I can't work out why, mathematically, centering the data (which is what I assume is happening here) has no effect on the solution.

Not looking for the answer, just a push in the right direction. Intuitively, I expect it's because shifting every data point by the same amount doesn't change the relationships between the points, so the optimal $w$ should be unaffected and only the bias $b$ has to absorb the shift. Showing this mathematically, however, is where I'm stuck.
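As a sanity check (not a proof), the claim can be verified numerically. The sketch below solves both problems in closed form via the normal equations, augmenting the data with a ones column for the bias and leaving that bias unpenalized; the data, dimensions, and $\lambda$ are arbitrary choices, not from the question.

```python
import numpy as np

rng = np.random.default_rng(0)
m, d, lam = 50, 3, 2.0          # arbitrary problem size and regularization
X = rng.normal(size=(m, d))
y = rng.normal(size=m)

def ridge_with_bias(X, y, lam):
    # Augment with a column of ones so b is learned jointly with w.
    Z = np.hstack([np.ones((X.shape[0], 1)), X])
    D = np.eye(X.shape[1] + 1)
    D[0, 0] = 0.0               # penalize only w, not the bias b
    theta = np.linalg.solve(Z.T @ Z + lam * D, Z.T @ y)
    return theta[0], theta[1:]  # b, w

b1, w1 = ridge_with_bias(X, y, lam)                    # raw data
b2, w2 = ridge_with_bias(X - X.mean(axis=0), y, lam)   # centered data

print(np.allclose(w1, w2))                             # same w either way
print(np.isclose(b2, b1 + w1 @ X.mean(axis=0)))        # biases differ by w·x̄
```

The second check also hints at where a proof might go: the two biases differ by exactly $w^T \bar x$, suggesting a substitution relating $b$ in one problem to $b$ in the other.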