Given that the solution to linear (or polynomial) regression can be written as:
$\textbf{w} = (X^{T}X)^{-1}X^{T}Y$
It is standard practice in machine learning to scale each column of the training set by that column's maximum absolute value, forcing each column $j$ to have values $-1 \leq x_{j} \leq 1$. I am having trouble figuring out how this scaling translates into the weight vector.
Let's say we scale every entry by the same constant $c$, i.e. replace $X$ with $cX$. Then we are left with the following:

$\textbf{w} = ((cX)^{T}(cX))^{-1}(cX)^{T}Y = \frac{1}{c}(X^{T}X)^{-1}X^{T}Y$
The factor of $\frac{1}{c}$ effectively cancels out when we predict an output for a scaled input. But how does this work out when each column is multiplied by a different scalar? Does it place some bound on the weights as a result?
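A quick numerical sketch of the constant-$c$ case (the data and seed are arbitrary, chosen just for illustration): fitting on $cX$ shrinks the weights by $\frac{1}{c}$, and predictions on the scaled inputs match predictions from the original fit.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
Y = X @ np.array([2.0, -1.0, 0.5]) + rng.normal(scale=0.1, size=50)

def ols(X, Y):
    # Solve the normal equations (X^T X) w = X^T Y
    return np.linalg.solve(X.T @ X, X.T @ Y)

c = 4.0
w = ols(X, Y)
w_scaled = ols(c * X, Y)

# Weights shrink by 1/c ...
assert np.allclose(w_scaled, w / c)
# ... and predictions on the scaled inputs are unchanged.
assert np.allclose((c * X) @ w_scaled, X @ w)
```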
Exactly the same idea. Here's another way to look at it.

Let's write out the regression equation:
$Y=\alpha + \beta_1x_1 + \beta_2x_2 +\epsilon$
Let $c_1,c_2$ be the scaling of $x_1$ and $x_2$, respectively. The regression equation can then be re-written as:
$Y=\alpha + (\beta_1c_1)\frac{x_1}{c_1} + (\beta_2c_2)\frac{x_2}{c_2} +\epsilon$
Thus, regressing on the scaled variable $\frac{x_j}{c_j}$ simply multiplies the corresponding coefficient by $c_j$; the fitted values (and the intercept $\alpha$) are unchanged. No special bound is imposed on the weights beyond this rescaling.
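The per-column version can be checked numerically as well (again with arbitrary illustrative data): dividing each column by its own maximum absolute value, as in the question, multiplies each slope by that column's scale and leaves the intercept alone.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 2))
Y = 3.0 + 2.0 * X[:, 0] - 1.5 * X[:, 1] + rng.normal(scale=0.1, size=100)

def ols(A, Y):
    # Solve the normal equations (A^T A) beta = A^T Y
    return np.linalg.solve(A.T @ A, A.T @ Y)

# Fit with an explicit intercept column.
A = np.column_stack([np.ones(len(X)), X])
beta = ols(A, Y)

# Scale each column by its maximum absolute value, as in the question.
c = np.abs(X).max(axis=0)
A_scaled = np.column_stack([np.ones(len(X)), X / c])
beta_scaled = ols(A_scaled, Y)

# Intercept unchanged; each slope multiplied by its column's scale c_j.
assert np.allclose(beta_scaled[0], beta[0])
assert np.allclose(beta_scaled[1:], beta[1:] * c)
```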