Errors in estimates of intercept and slope in least squares method


We want to find the line $y = mx + c$ that best "fits" the list of points $$(x_1, y_1), (x_2, y_2), \dots, (x_i, y_i), \dots, (x_n, y_n).$$ For each point there is no uncertainty in $x_i$, and each $y_i$ has uncertainty $\sigma_i$.

By minimizing the sum of squared differences

$$\sum_i [y_i - (mx_i + c)]^2$$

with respect to $m$ and then with respect to $c$, and then solving the resulting system of equations, we can find $m$ and $c.$ This I understand.

My doubt is how can we find uncertainties for $m$ and $c?$

My textbook just says that we can sum the squared partial derivatives of $m$ with respect to each $y_i$, multiplied by $\sigma_i^2$ (the variance of the given $y_i$), and it performs the calculation in just one line.

Could anyone explain in more detail how to calculate such errors? I think seeing an example with 3 points would make me understand the concept without the trouble of too much notation.

Here is the calculation of my book that gives me trouble:

$$\operatorname{Var}[m] = \sum_i \left(\frac{\partial m}{\partial Y_i}\right)^2 \sigma_{Y_i}^2 = \left(\frac{x_i}{\sigma_{Y_i}^2} - \frac{\overline x}{\sigma_{Y_i}^2}\right)^2 \cdot \frac{1}{\operatorname{Var}[x]^2} \cdot \frac{1}{\sum_i \frac{1}{\sigma_{Y_i}^2}} = \frac{1}{\operatorname{Var}[x] \sum_i \frac{1}{\sigma_{Y_i}^2}}$$

At the start of the second line my book uses $x_i$ outside of any summation symbol; I think that is an error.
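Since a three-point example was requested: below is a minimal numeric sketch in Python (NumPy) with made-up data and uncertainties. It computes the weighted least-squares $m$ and $c$ (weights $w_i = 1/\sigma_i^2$), then evaluates $\operatorname{Var}[m]$ two ways: by the propagation sum $\sum_i (\partial m/\partial Y_i)^2 \sigma_i^2$ that the textbook uses, and by the standard closed form $S/\Delta$; the two agree, which is the point of the textbook's one-line calculation.

```python
import numpy as np

# Made-up 3-point data set, chosen only for illustration.
x = np.array([1.0, 2.0, 3.0])
y = np.array([2.1, 3.9, 6.2])
sigma = np.array([0.1, 0.2, 0.1])    # uncertainty of each y_i

w = 1.0 / sigma**2                   # weights w_i = 1/sigma_i^2
S, Sx, Sy = w.sum(), (w * x).sum(), (w * y).sum()
Sxx, Sxy = (w * x * x).sum(), (w * x * y).sum()
Delta = S * Sxx - Sx**2

m = (S * Sxy - Sx * Sy) / Delta      # weighted least-squares slope
c = (Sxx * Sy - Sx * Sxy) / Delta    # weighted least-squares intercept

# Error propagation: Var[m] = sum_i (dm/dy_i)^2 * sigma_i^2,
# where dm/dy_i = w_i (S x_i - Sx) / Delta.
dm_dy = w * (S * x - Sx) / Delta
var_m_prop = np.sum(dm_dy**2 * sigma**2)

# Closed form obtained by simplifying that sum: Var[m] = S / Delta.
var_m_closed = S / Delta

print(m, c, var_m_prop, var_m_closed)
```

Running this, the two variance computations give the same number, which is what the book's derivation asserts (its closed form is written with a weighted $\operatorname{Var}[x]$, but algebraically it is the same quantity as $S/\Delta$).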


Let $X$ denote the matrix with 1s in the first column and $x_i$ in the second column. Then the coefficient vector is $(c, m) = (X^TX)^{-1}X^Ty$. Let $r$ denote the second row of $(X^TX)^{-1}X^T$ (the row corresponding to the slope); then $$m=r^T y = \sum_i r_i y_i.$$ Assuming the uncertainties of the $y_i$ are uncorrelated, you get $$\sigma^2(m)=\sum_i r_i^2 \sigma^2(y_i).$$ You can do the same for $c$ using the first row of $(X^TX)^{-1}X^T$.
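As a concrete illustration of this answer, here is a Python (NumPy) sketch with made-up data. Note that the fit here is unweighted, matching the formula $(X^TX)^{-1}X^Ty$ above; the $\sigma_i$ enter only through the propagation step.

```python
import numpy as np

# Made-up 3-point data set for illustration.
x = np.array([1.0, 2.0, 3.0])
y = np.array([2.1, 3.9, 6.2])
sigma = np.array([0.1, 0.2, 0.1])

X = np.column_stack([np.ones_like(x), x])   # design matrix: columns [1, x_i]
H = np.linalg.inv(X.T @ X) @ X.T            # (X^T X)^{-1} X^T, shape (2, n)

c, m = H @ y                                # least-squares coefficients

# Each coefficient is a fixed linear combination of the y_i,
# so its variance follows from uncorrelated error propagation:
var_c = np.sum(H[0]**2 * sigma**2)
var_m = np.sum(H[1]**2 * sigma**2)
print(m, c, var_m, var_c)
```

The rows of $H = (X^TX)^{-1}X^T$ are exactly the coefficient vectors $r$ of the answer, so squaring them and summing against $\sigma^2(y_i)$ gives the stated variances.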


I could be wrong, but I doubt that a demonstration with $n = 3$ will do much for your intuition. In that case, by the time we estimate the slope and intercept, we do not have a good estimate of variability. Maybe it will help to show several regressions with $n = 20.$

Model. Suppose $x_i = i,$ for $i = 1, 2, \dots, 20.$ That is, the $x$'s are just the integers from $1$ through $20.$ If the true y-intercept is $\beta_0 = 4$ and the true slope is $\beta_1 = 1.5,$ then the regression model is $$Y_i = \beta_0 + \beta_1 x_i + e_i,$$ where the errors are independently $e_i \sim \mathsf{Norm}(\mu=0,\, \sigma=2).$

Then we can generate the $Y_i$ in R statistical software, according to this known model, as follows.

set.seed(622);  n = 20;  b0 = 4;  b1 = 1.5;  sg = 2;  x = 1:n
y = b0 + b1*x + rnorm(n, 0, sg)
plot(x, y, pch=19);  abline(a = b0, b=b1, col="green3")

The 20 points $(x_i, Y_i)$ are plotted below along with the known linear relationship $Y_i = 4 + 1.5x_i$ and the variability of the points shows the effect of the normal errors $e_i$.

[Figure: scatterplot of the 20 simulated points with the true line $y = 4 + 1.5x$ in green.]

Regression line. How accurately can the regression line (least squares line) recover the information about our model? In R, we find the estimated y-intercept $\hat\beta_0 = 3.519$ and the estimated slope $\hat\beta_1 = 1.551,$ as follows:

lm(y ~ x)

Call:
lm(formula = y ~ x)

Coefficients:
(Intercept)            x  
      3.519        1.551  

The figure below shows the same data points and true model (green) as above, along with the regression line (dashed red). The regression line is not a perfect copy of the true model, but it is close enough to be useful. Of course, a different experiment using the same model would have different random errors $e_i$, so another experiment would give a slightly different regression line.

[Figure: the same data with the true model line (green) and the fitted regression line (dashed red).]

Distribution of estimates of slope $\beta_1.$ Essentially, your question asks for the distribution of the estimates of the y-intercept and slope. For simplicity, we focus just on the slope. We repeat the regression procedure 100,000 times. Each time we use $n=20,\, \beta_0 = 4,\, \beta_1 = 1.5,$ and $\sigma=2.$ However, on each iteration we simulate new errors $e_i,$ in order to try to understand the variability in the distribution of $\hat\beta_1.$

 set.seed(622);  m = 10^5;  n = 20;  x = 1:n;  b0 = 4;  b1=1.5;  sg = 2
 b1.hat = replicate(m, lm(b0 + b1*x + rnorm(n,0,sg) ~ x)$coef[2])
 mean(b1.hat);  sd(b1.hat)
 ## 1.499939    # Expected(b1.hat) aprx 1.5
 ## 0.07764731  # SD(b1.hat)
 hist(b1.hat, prob=T, col="skyblue2", main="Distribution of Slope Estimates")

[Figure: histogram of the 100,000 simulated slope estimates $\hat\beta_1$.]

The simulation suggests that $E(\hat\beta_1) = \beta_1 = 1.5.$ According to statistical theory, $$SD(\hat\beta_1) = \sigma/\sqrt{(n-1)S_x^2} = 2/\sqrt{665} = 0.07756,$$ which is well-approximated (by 0.07765) in the simulation.
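The theoretical value quoted above can be checked directly by arithmetic (a quick numeric check, here in Python, independent of the R simulation):

```python
import numpy as np

n, sigma = 20, 2.0
x = np.arange(1, n + 1)
Sx2 = np.var(x, ddof=1)                    # sample variance of 1..20, which is 35
sd_slope = sigma / np.sqrt((n - 1) * Sx2)  # sigma / sqrt((n-1) * S_x^2)
print(round(sd_slope, 5))
```

This reproduces $2/\sqrt{19 \cdot 35} = 2/\sqrt{665} \approx 0.07756,$ the value the simulation approximated.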

The histogram seems to be approximately normal in shape. More precisely,

$$\frac{\hat\beta_1-\beta_1}{S_{Y|x}/\sqrt{(n-1)S_x^2}}\sim \mathsf{T}(\nu = n-2),$$

Student's t distribution with $n-2$ degrees of freedom, where $S_{Y|x}$ is an estimate of $\sigma = 2$ (using $x$'s and $Y$'s). So the exact distribution of $\hat \beta_1$ is based on a t distribution that is nearly normal. In practical applications, the t distribution can be used to make a 95% confidence interval for the unknown true slope $\beta_1.$