Linear Least Squares Noise vs. Estimator


Is the variance of the estimator the same as the variance of the measurement noise?

What is the difference between the estimator and the noise?

If I have an estimator $P'$, the regression vector $V_N$, and the model $t_N$, can I say that $var(t_N) = \sigma_{\epsilon}^2$, where $\epsilon(k)$ is the noise?

So the estimate of $var(P') = \sigma_{\epsilon}^2 (V_N^{T} V_N)^{-1}$?

Or is the estimate of $var(P') = \frac{1}{n-1} \sum (residuals)^{2}$?

What is the difference?


There are 1 best solutions below


I suppose you are using the model $$Y_i = \beta_0 + \beta_1x_i + e_i,$$ where $e_i \stackrel{iid}{\sim}\mathsf{Norm}(0, \sigma).$ Then $\hat\beta_1 = r_{x,Y}S_Y/S_x$ and $\hat\beta_0 = \bar Y - \hat\beta_1\bar x$ are the respective estimates of the slope $\beta_1$ and the y-intercept $\beta_0.$ (Here $r_{x,Y}$ is the Pearson correlation between the $x$'s and the $Y$'s, and $S_x$ and $S_Y$ are their sample standard deviations.)

Also, $\sigma^2$ is estimated by $S_{Y|x}^2 = \frac{\sum_i (Y_i - \hat Y_i)^2}{n-2},$ where $\hat Y_i = \hat\beta_0 + \hat\beta_1x_i.$ (The differences $Y_i - \hat Y_i$ are called residuals.)

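The estimated standard deviation (standard error) of the slope is $SD(\hat\beta_1) = S_{Y|x}\sqrt{\frac{1}{(n-1)S_x^2}}$ and that of the intercept is $SD(\hat \beta_0) = S_{Y|x}\sqrt{\frac 1n + \frac{\bar x^2}{(n-1)S_x^2}}.$ (The true standard deviations $\sigma_{\hat\beta_1}$ and $\sigma_{\hat\beta_0}$ have the same form with $\sigma$ in place of $S_{Y|x}.$)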
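As a rough numerical check of these formulas, here is a minimal sketch in Python with NumPy. The data values and the helper name `simple_ols` are made up for illustration. The last lines compare the result with the matrix form $S_{Y|x}^2 (X^T X)^{-1}$, where $X$ is the design matrix with a column of ones and a column of $x$ values; I am assuming this is what the question's $V_N$ stands for.

```python
import numpy as np

def simple_ols(x, y):
    """Fit y = b0 + b1*x; return estimates, residual variance, standard errors.

    Hypothetical helper written for this illustration only.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size
    s_x = x.std(ddof=1)              # sample SD of the x's
    s_y = y.std(ddof=1)              # sample SD of the Y's
    r = np.corrcoef(x, y)[0, 1]      # Pearson correlation r_{x,Y}

    b1 = r * s_y / s_x               # slope estimate
    b0 = y.mean() - b1 * x.mean()    # intercept estimate

    resid = y - (b0 + b1 * x)        # residuals Y_i - Yhat_i
    s2 = np.sum(resid**2) / (n - 2)  # S_{Y|x}^2, estimate of sigma^2

    sxx = (n - 1) * s_x**2
    se_b1 = np.sqrt(s2 / sxx)
    se_b0 = np.sqrt(s2 * (1.0 / n + x.mean()**2 / sxx))
    return b0, b1, s2, se_b0, se_b1

# Made-up data, for illustration only.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1, 12.2])
b0, b1, s2, se_b0, se_b1 = simple_ols(x, y)

# Matrix form: estimated covariance of (b0, b1) is s2 * (X^T X)^{-1}.
X = np.column_stack([np.ones_like(x), x])
cov = s2 * np.linalg.inv(X.T @ X)
se_matrix = np.sqrt(np.diag(cov))

print("noise variance estimate:", s2)
print("standard errors (scalar formulas):", se_b0, se_b1)
print("standard errors (matrix formula): ", se_matrix)
```

The two sets of standard errors should agree, which suggests that the scalar formulas above are just the diagonal of the matrix expression written out for a single regressor.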

In summary, I believe you are referring to the $e_i$ as 'noise'. Then the 'noise variance' $\sigma^2$ is estimated by $S_{Y|x}^2.$ The variances of the estimated regression coefficients are built from this estimate, but they are different quantities: each one is $S_{Y|x}^2$ multiplied by a factor that depends on $n$ and on the $x$'s.
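To make the distinction concrete, here is a small simulation sketch in Python with NumPy. The true parameter values, the $x$ grid, and the number of repetitions are all assumptions chosen for illustration. It draws many data sets from the same model and compares the noise variance $\sigma^2$ with the variance of the slope estimate across data sets.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed "true" model, for illustration only.
beta0, beta1, sigma = 1.0, 2.0, 0.5
x = np.linspace(0.0, 10.0, 20)
n = x.size
xc = x - x.mean()
sxx = np.sum(xc**2)                  # equals (n-1) * S_x^2

# Simulate many data sets: one row of Y per repetition.
reps = 20000
noise = rng.normal(0.0, sigma, size=(reps, n))
Y = beta0 + beta1 * x + noise

# Least-squares estimates for every repetition at once.
b1 = (Y @ xc) / sxx                  # slopes
b0 = Y.mean(axis=1) - b1 * x.mean()  # intercepts
resid = Y - (b0[:, None] + b1[:, None] * x)
s2 = np.sum(resid**2, axis=1) / (n - 2)   # S_{Y|x}^2 per data set

noise_var = sigma**2                 # variance of the noise e_i
theory_var_b1 = sigma**2 / sxx       # variance of the slope estimator

print("noise variance sigma^2:        ", noise_var)
print("average of S_{Y|x}^2:          ", s2.mean())
print("theoretical var of slope:      ", theory_var_b1)
print("empirical var of slope:        ", b1.var(ddof=1))
```

In a run like this, the average of $S_{Y|x}^2$ should land near $\sigma^2$, while the spread of the slope estimates should land near $\sigma^2/((n-1)S_x^2)$, which is typically much smaller than $\sigma^2$ itself.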

An alternate expression for the estimate of $\sigma^2$ is $$S_{Y|x}^2 = \frac{n-1}{n-2}S_Y^2(1-r_{x,Y}^2).$$
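A short sketch of where this comes from, using the shorthand $S_{xx} = (n-1)S_x^2$ and $S_{YY} = (n-1)S_Y^2$ (my notation, not used elsewhere in this answer). Substituting $\hat\beta_1 = r_{x,Y}S_Y/S_x$ into the sum of squared residuals gives $$\sum_i (Y_i - \hat Y_i)^2 = S_{YY} - \hat\beta_1^2 S_{xx} = (n-1)S_Y^2 - r_{x,Y}^2\frac{S_Y^2}{S_x^2}(n-1)S_x^2 = (n-1)S_Y^2(1-r_{x,Y}^2),$$ and dividing by $n-2$ gives the expression above.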

Intuitively, if the coefficient of determination $r_{x,Y}^2 = 1,$ then the data points $(x_i, Y_i)$ are perfectly fit by the regression line. (All residuals are $0.$) So there is no 'noise variance': $S_{Y|x}^2 = 0.$ By contrast, if $r_{x,Y}^2 \approx 0,$ then the noise about the line is almost the same as the variability of the $Y_i,$ and the regression line is of no use in predicting values of $Y$ from values of $x.$