Why are these variances calculated differently?


I'm studying from the book Data Analysis and Graphics Using R: An Example-Based Approach, and in chapter 5, page 150, the authors write:

There are two types of predictions: prediction of points on the line, and prediction of a new data value. The SE estimates of predictions for new data values take account both of uncertainty in the line and of the variation of individual points about the line.

see this picture:

The point $(x,\hat Y)$ is the point on the line and $x^*$ is the new data value I want to predict. Note that the only difference between the points $(x,\hat Y)$ and $(x^*,\hat Y^*)$ is that for the former the true response $Y$ is given.

Following the book, to get the variance of the prediction when we know the true response $Y$, we only need the variance of the fitted line, i.e., $\mathbb V(\hat\beta_0 +\hat\beta_1X_i)$; when we don't know the true response, the variance is the variance of the line plus the variance of the noise $\epsilon_i$.
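
For reference, here are the two variances as I understand them in their standard textbook form for simple linear regression. I am assuming independent errors with constant variance $\sigma^2$; the symbols $n$, $\bar x$ and $S_{xx}=\sum_i (x_i-\bar x)^2$ are my own notation and not necessarily the book's:

$$
\mathbb V(\hat\beta_0+\hat\beta_1 x)=\sigma^2\left(\frac{1}{n}+\frac{(x-\bar x)^2}{S_{xx}}\right),
\qquad
\mathbb V(Y^*-\hat Y^*)=\sigma^2\left(1+\frac{1}{n}+\frac{(x^*-\bar x)^2}{S_{xx}}\right).
$$

The two expressions differ only by the extra $\sigma^2$, which is the variance of the noise attached to the new observation $Y^*$.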

Could someone give me a formal or intuitive explanation of why we don't need to take the variance of the noise into account in the first case? As I see it, the point I want to predict should lie in the region of the possible lines plus the noise in both cases.

In other words, I don't understand why the noise does not count toward the variance when I have the true response $Y$. This noise $Y-\hat Y$ is specific to this fitted line, but the lines are random: every new line will come with a different $\epsilon$, just as in the second case.
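
To check the two quantities numerically, here is a minimal simulation sketch. Everything in it is an assumption made for illustration and not taken from the book: the true line ($\beta_0=1$, $\beta_1=2$), the noise level $\sigma=1$, the six design points, the choice $x^*=2.5$, and the helper names `fit_line` and `simulate`. It refits the line many times on fresh noisy data and compares the variance of the fitted value at $x^*$ with the variance of the error made when predicting a new observation at $x^*$.

```python
import random
import statistics


def fit_line(xs, ys):
    """Ordinary least squares for a single predictor; returns (b0, b1)."""
    n = len(xs)
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    b0 = ybar - b1 * xbar
    return b0, b1


def simulate(n_rep=20000, sigma=1.0, x_star=2.5, seed=0):
    """Return (variance of fitted value at x_star,
               variance of prediction error for a new observation at x_star)."""
    rng = random.Random(seed)
    xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]  # hypothetical fixed design points
    beta0, beta1 = 1.0, 2.0              # hypothetical true line
    fitted_values = []
    prediction_errors = []
    for _ in range(n_rep):
        # Fresh noisy sample around the true line, then refit the line.
        ys = [beta0 + beta1 * x + rng.gauss(0.0, sigma) for x in xs]
        b0, b1 = fit_line(xs, ys)
        y_hat = b0 + b1 * x_star
        # A new observation at x_star carries its own independent noise.
        y_new = beta0 + beta1 * x_star + rng.gauss(0.0, sigma)
        fitted_values.append(y_hat)
        prediction_errors.append(y_new - y_hat)
    return (statistics.variance(fitted_values),
            statistics.variance(prediction_errors))


if __name__ == "__main__":
    v_line, v_pred = simulate()
    print("variance of fitted value at x*:   ", v_line)
    print("variance of new-value pred. error:", v_pred)
```

With these assumed settings $x^*$ equals the mean of the design points, so the usual formulas give $\sigma^2/n = 1/6$ for the first variance and $\sigma^2(1+1/n) = 7/6$ for the second; the simulated values should land close to those, with the gap between them close to $\sigma^2$.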