Linear regression (linear curve fit) - derivation of the ordinary least squares method


I am trying to step through the derivation of linear regression curve fitting with the ordinary least squares method. Everything looks fine, except that I am puzzled by how multiple sources make the jump from Step 1 to Step 2 shown below.

Step 1:

$$m=\frac{\sum_i(\overline{y}-y_i)}{\sum_i(\overline{x}-x_i)}$$

Step 2:

$$m=\frac{\sum_i((\overline{y}-y_i)\cdot(\overline{x}-x_i))}{\sum_i(\overline{x}- x_i)^2}$$

Source1 | Source2

When I calculate the slope with Step 1, which in theory should give the same result as Step 2, I get a divide-by-zero error, because $\sum_i(\overline{x}-x_i)$ always equals $0$. So how can these two equations be equal when one yields a divide-by-zero error and the other returns the correct slope for a cluster of points?
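A quick numerical check makes the problem concrete. This is only a sketch: the sample points are made up and lie exactly on $y = 2x + 1$, and the variable names are my own.

```python
# Made-up sample points lying exactly on y = 2x + 1 (illustrative data only).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.0 * x + 1.0 for x in xs]

n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n

# Step 1: sums of deviations from the mean, in numerator and denominator.
step1_num = sum(y_bar - y for y in ys)
step1_den = sum(x_bar - x for x in xs)

# Step 2: sum of products of deviations over sum of squared deviations.
step2_num = sum((y_bar - y) * (x_bar - x) for x, y in zip(xs, ys))
step2_den = sum((x_bar - x) ** 2 for x in xs)
slope = step2_num / step2_den

print("Step 1 numerator and denominator:", step1_num, step1_den)
print("Step 2 slope:", slope)
```

For this data both Step 1 sums vanish, so the ratio is $0/0$, while Step 2 recovers the slope of the line the points were generated from.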

UPDATE: The derivations shown in the sources are incorrect and misleading! Step 2 is in fact the correct answer, but both sources claim that Step 2 follows from Step 1, which is incorrect. The erroneous step is shown below (along with the corrected derivation).

$$\sum_i(y_i\cdot x_i-\overline{y}\cdot x_i+m\cdot\overline{x}\cdot x_i-m\cdot x_i^2)=0$$

The incorrect step in the sources was to factor out $x_i$ and divide both sides by $x_i$ to remove it from the equation. This is invalid because $x_i$ is not a constant: it varies with $i$, so it cannot be pulled out of the summation.
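A small counterexample shows why the factor cannot be cancelled. The numbers here are made up purely for illustration: take $x = (1, 2)$ and let $a = (2, -1)$ stand for the remaining factor in each term. Then

$$\sum_i x_i a_i = 1\cdot 2 + 2\cdot(-1) = 0 \qquad\text{but}\qquad \sum_i a_i = 2 + (-1) = 1 \neq 0$$

so $\sum_i x_i a_i = 0$ does not imply $\sum_i a_i = 0$.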

The correct step would have been to split the sum and solve for $m$:

$$m=\frac{\sum_i(\overline{y}\cdot x_i-y_i\cdot x_i)}{\sum_i(\overline{x}\cdot x_i-x_i^2)}$$

Reference.
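As a check that this corrected expression agrees with Step 2, here is a sketch of the remaining algebra. It uses only the fact that $\sum_i(\overline{x}-x_i)=0$ and $\sum_i(\overline{y}-y_i)=0$, so any sum of the form $\sum_i \overline{x}\cdot(\ldots)$ over those deviations vanishes:

$$\sum_i x_i(\overline{x}-x_i)=\sum_i\overline{x}(\overline{x}-x_i)-\sum_i(\overline{x}-x_i)^2=-\sum_i(\overline{x}-x_i)^2$$

$$\sum_i x_i(\overline{y}-y_i)=\sum_i\overline{x}(\overline{y}-y_i)-\sum_i(\overline{x}-x_i)(\overline{y}-y_i)=-\sum_i(\overline{x}-x_i)(\overline{y}-y_i)$$

The two minus signs cancel in the ratio, which leaves exactly the Step 2 formula.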


There are 2 best solutions below

2
On BEST ANSWER

After an admittedly quick look, I think the derivations you cite are convoluted at best, and faulty at worst. I believe the step they make to get to what you are calling "Step 1" is incorrect. That expression does not follow from the prior step.

The OLS derivation on Wikipedia is sound.

6
On

Your first expression yields $\frac 00$ because the sum of the $\overline y$s is the same as the sum of the $y_i$ by definition of $\overline y$, similarly for $x$.
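Written out, with $n$ denoting the number of data points (a symbol not used in the question):

$$\sum_i(\overline{y}-y_i)=n\,\overline{y}-\sum_i y_i=n\cdot\frac{1}{n}\sum_i y_i-\sum_i y_i=0$$

and the same computation with $x$ in place of $y$ gives $\sum_i(\overline{x}-x_i)=0$.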

Your second expression is not equivalent to the first. It looks like you have just multiplied numerator and denominator by $(\overline x -x_i)$, but that is inside the sum and the term that multiplies it also depends on $i$. The denominator is now a sum of squares and is positive unless all the $x_i$ are identical.