Why does the order of elements affect the line of best fit/linear regression?

351 Views Asked by Bumbble Comm At 29 Mar 2026 - 7:30

Consider the following y-values, taken at $x = 0, 1, \dots, 9$ so that the points are $(0,y_0),(1,y_1),\dots$:

$$580,382,854,193,128,901,283,294,854,490$$

Fitting a linear regression to these points gives the following line (coefficients rounded):

$$y = 4.5x + 475.8$$

However, swapping two of the y-values (the 193 and the second 854), like so:

$$580,382,854,854,128,901,283,294,193,490$$

gives the following line of best fit:

$$y = -35.6x + 656.1$$

Why does the order matter when it comes to a line of best fit? The elements are the same and the algorithm I am using has no interaction between x- and y-variables:

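// Least-squares fit of y against x = 0, 1, ..., 9
sx = 0; sy = 0; stt = 0; sts = 0;
yArray = {}; // the ten y-values from above, in order
// First pass: sum the x-values (the indices) and the y-values
for (i = 0; i < 10; ++i) {
    sx = sx + i;
    sy = sy + yArray[i];
}
// Second pass: t is the x-value minus the mean of x
for (i = 0; i < 10; ++i) {
    t = i - (sx / 10);
    stt = stt + (t * t);
    sts = sts + (t * yArray[i]);
}
slope = sts / stt;
intercept = (sy - (sx * slope)) / 10;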

There's nothing like sx + sy or sx * sy, etc. I just don't see where the order matters here.

**Bumbble Comm** · Accepted Answer

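The algorithm you are using must somehow assess the association between x and y. Otherwise, it can't give you a regression line. In your code that happens in the line sts = sts + (t * yArray[i]), which multiplies each centered x-value t by the y-value stored at the same index i. (I believe @CarlHeckman has put his finger on it in his second comment.)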
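
To make this concrete, here is a minimal Python sketch of the same computation as in the question, run on both orderings of the ten y-values. The function name fit_line is my own choice, and the x-values are assumed to be the indices 0 through 9, as in the question.

```python
def fit_line(ys):
    """Least-squares slope and intercept for the points (0, ys[0]), (1, ys[1]), ..."""
    n = len(ys)
    sx = sum(range(n))          # sum of the x-values 0, 1, ..., n-1
    sy = sum(ys)                # sum of the y-values (order does not matter here)
    stt = 0.0
    sts = 0.0
    for i, y in enumerate(ys):
        t = i - sx / n          # x-value minus the mean of x
        stt += t * t
        sts += t * y            # pairs the i-th x with the i-th y (order matters here)
    slope = sts / stt
    intercept = (sy - sx * slope) / n
    return slope, intercept


original = [580, 382, 854, 193, 128, 901, 283, 294, 854, 490]
swapped = [580, 382, 854, 854, 128, 901, 283, 294, 193, 490]

for name, ys in (("original", original), ("swapped", swapped)):
    slope, intercept = fit_line(ys)
    print(f"{name}: y = {slope:.1f}x + {intercept:.1f}")
```

Both lists have the same sum, so sy is unchanged by the swap. Only sts differs, because 193 and 854 are now multiplied by different values of t.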

By changing the order of the y's without making a corresponding change in the order of the x's, you are destroying the bivariate nature of the data. Consider the following fake data:

 Subject:  1  2  3  4  5
 x:        2  4  6  8 10
 y1:       0  1  2  3  4

Here the x's and y's give points on an ascending line. Their correlation is 1.

But if I mix the y-values around, I destroy the 'pairing' that traces back to the subjects. Now the plotted (x, y) points no longer lie in a straight line. Their correlation is $r = 0.6$.

 Subject:  1  2  3  4  5
 x:        2  4  6  8 10
 y2:       0  3  2  1  4
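
For anyone who wants to check the value $r = 0.6$ by hand, here is the usual Pearson correlation formula applied to x and y2, using $\bar x = 6$ and $\bar y = 2$:

$$r = \frac{\sum (x_i-\bar x)(y_i-\bar y)}{\sqrt{\sum (x_i-\bar x)^2 \, \sum (y_i-\bar y)^2}} = \frac{(-4)(-2)+(-2)(1)+(0)(0)+(2)(-1)+(4)(2)}{\sqrt{40 \cdot 10}} = \frac{12}{20} = 0.6$$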

And if I put the y's in reverse order, then I get a line that goes the other direction, and the correlation is -1.

 Subject:  1  2  3  4  5
 x:        2  4  6  8 10
 y3:       4  3  2  1  0

Changing the correlation changes the slope of the regression line. Here are plots of the original and changed data.
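
The link between correlation and slope can also be checked numerically. The following is a small Python sketch, not a definitive implementation; the helper name correlation_and_slope is hypothetical, and it uses the standard least-squares slope $S_{xy}/S_{xx}$, which has the same sign as $r$.

```python
from math import sqrt


def correlation_and_slope(xs, ys):
    """Pearson correlation and least-squares slope for paired data."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    # The cross term is the only place where each x meets its own y.
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    r = sxy / sqrt(sxx * syy)
    slope = sxy / sxx           # equivalently r * sqrt(syy / sxx)
    return r, slope


x = [2, 4, 6, 8, 10]
orderings = {
    "y1": [0, 1, 2, 3, 4],
    "y2": [0, 3, 2, 1, 4],
    "y3": [4, 3, 2, 1, 0],
}

for name, ys in orderings.items():
    r, slope = correlation_and_slope(x, ys)
    print(f"{name}: r = {r:.1f}, slope = {slope:.2f}")
```

The three orderings contain the same y-values, so sxx and syy are identical in every case. Only the cross term sxy changes, and with it both the correlation and the slope.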
