Trouble with finding the variance of $(\hat{Y}_0 - Y_0)$ in terms of regression estimation


This question comes from Rice's Mathematical Statistics Ch. 14 - (14):

[images of the problem statement from Rice, not reproduced here]

Values which may be of use:

$\hat{\beta_{0}} = \bar{y} - \hat{\beta_{1}}\bar{x}$

$Var(\hat{\beta_{1}}) = \frac{\sigma^{2}}{S_{xx}}$
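
Both identities can be sanity-checked by simulation: refit the model on many datasets generated from the same fixed design and compare the empirical variance of $\hat{\beta}_1$ to $\sigma^2/S_{xx}$. A minimal sketch (the design points, true parameters, and seed below are made-up illustrative values):

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])  # fixed design (illustrative values)
sigma = 2.0
beta0_true, beta1_true = 1.0, 0.5
Sxx = np.sum((x - x.mean()) ** 2)

# Refit on many simulated datasets with the same fixed x.
b1_draws = []
for _ in range(100_000):
    y = beta0_true + beta1_true * x + rng.normal(0, sigma, size=x.size)
    b1 = np.sum((x - x.mean()) * (y - y.mean())) / Sxx  # least-squares slope
    b0 = y.mean() - b1 * x.mean()                       # beta0_hat = ybar - beta1_hat * xbar
    b1_draws.append(b1)

print(np.var(b1_draws))   # empirical Var(beta1_hat)
print(sigma**2 / Sxx)     # theoretical value
```

The empirical variance should land very close to $\sigma^2/S_{xx}$; the key point is that the design $x$ is held fixed across replications, matching the fixed-design assumption behind the formula.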

My problem is in deriving this expression correctly. I think the issue is that I'm not treating certain parameters and estimates in the right way.

Attempt

$$\hat{Y} - Y_{0} = \bar{y} - \hat{\beta_{1}}\bar{x} - \beta_{0} + x_{0} \hat{\beta_{1}} - x_{0}\beta_{1} - e_{0} \\ = \bar{y} - \beta_{0} + \hat{\beta_{1}}(x_{0} - \bar{x}) - x_{0}\beta_{1} - e_{0}$$

Taking the variance:

$$Var(\hat{Y} - Y_{0}) = Var(\bar{y}) + Var(\beta_{0}) + (x_{0} - \bar{x})^{2} Var(\hat{\beta_{1}}) + Var(x_{0}\beta_{1}) + Var(e_{0}) \\ = \frac{\sigma^{2}}{n} + Var(\beta_{0}) + (x_{0} - \bar{x})^{2}\frac{\sigma^{2}}{S_{xx}} + Var(x_{0}\beta_{1}) + \sigma^{2} $$

This is where I'm stuck. I'm not sure how to treat $ Var(\beta_{0})$ and $Var(x_{0}\beta_{1})$. They are not constants, at least I don't think they are. So I'm missing an assumption I should be applying. Fortunately I do have a solution of what I'm supposed to arrive at:

[image of the solution omitted; the target expression is $\text{Var}(\hat{Y}_0 - Y_0) = \sigma^2\left[1 + \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{S_{xx}}\right]$]

But I can't seem to work out how to get to this final step based on what I have. What is it I'm forgetting to do?

There are 2 answers below.

Best Answer

One assumes that the true process $$Y_0 = \beta_0 + \beta_1x_0 + e_0$$ holds, where all are constants except $e_0 \sim \mathcal{N}(0, \sigma^2)$. Assume $n$ is the number of data points.

Suppose we have an estimate $\widehat{Y}_0 = \widehat{\beta}_0 + \widehat{\beta}_1x_0$, where all are random variables, with the exception of the fixed $x_0$.

One has $$\widehat{Y}_0 - Y_0 = \widehat{\beta}_0+\widehat{\beta}_1x_0 - (\beta_0 + \beta_1x_0+e_0)\text{.}$$ For computing the variance, keeping the discussion above in mind, we have $$\text{Var}(\widehat{Y}_0 - Y_0 ) = \text{Var}(\widehat{\beta}_0+\widehat{\beta}_1x_0-e_0)$$ since the constants do not affect the variance. Next, we have that $$\text{Var}(\widehat{\beta}_0+\widehat{\beta}_1x_0-e_0) = \text{Var}(\widehat\beta_0) + x_0^2\text{Var}(\widehat\beta_1) + \text{Var}(e_0)+2x_0\text{Cov}(\widehat\beta_0, \widehat\beta_1)-2\text{Cov}(\widehat\beta_0, e_0) - 2x_0\text{Cov}(\widehat\beta_1, e_0)\text{.}$$ We assume independence between the least-squares estimators and $e_0$ (the new error $e_0$ is independent of the errors in the training data), so the covariances involving $e_0$ are $0$ and thus $$\begin{align} \text{Var}(\widehat{\beta}_0+\widehat{\beta}_1x_0-e_0) &= \text{Var}(\widehat\beta_0) + x_0^2\text{Var}(\widehat\beta_1) + \text{Var}(e_0)+2x_0\text{Cov}(\widehat\beta_0, \widehat\beta_1) \\ &= \sigma^2\left(\dfrac{\sum x_i^2}{nS_{xx}} + \dfrac{x_0^2}{S_{xx}} + 1 - 2x_0 \cdot \dfrac{\bar{x}}{S_{xx}} \right) \end{align}$$ using $\text{Var}(\widehat\beta_0) = \sigma^2\dfrac{\sum x_i^2}{nS_{xx}}$ and $\text{Cov}(\widehat\beta_0, \widehat\beta_1) = -\sigma^2\dfrac{\bar{x}}{S_{xx}}$, both of which can be derived with some algebra from the equations provided in the Rice text.

Now $$\dfrac{\sum x_i^2}{nS_{xx}} = \dfrac{\sum x_i^2 - n\bar{x}^2+n\bar{x}^2}{nS_{xx}} = \dfrac{\sum (x_i - \bar{x})^2+n\bar{x}^2}{nS_{xx}} = \dfrac{S_{xx} + n\bar{x}^2}{nS_{xx}} = \dfrac{1}{n}+\dfrac{\bar{x}^2}{S_{xx}}$$ thus we obtain $$\text{Var}(\widehat{\beta}_0+\widehat{\beta}_1x_0-e_0) = \sigma^2\left(\dfrac{1}{n}+\dfrac{\bar{x}^2 + x_0^2 - 2x_0\bar{x}}{S_{xx}} + 1\right) = \sigma^2\left[1 + \dfrac{1}{n} + \dfrac{(x_0 - \bar{x})^2}{S_{xx}} \right]$$ See, for example, https://en.wikipedia.org/wiki/Mean_and_predicted_response#Predicted_response for comparison.
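
The final formula can also be verified numerically: generate training data, fit the line, draw a fresh observation $Y_0$ at $x_0$, and compare the empirical variance of $\widehat{Y}_0 - Y_0$ to $\sigma^2\left[1 + \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{S_{xx}}\right]$. A sketch with arbitrary illustrative choices of design, parameters, and $x_0$:

```python
import numpy as np

rng = np.random.default_rng(1)
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])  # fixed design (illustrative)
n = x.size
sigma = 1.5
beta0, beta1 = 2.0, -0.7
x0 = 4.5
Sxx = np.sum((x - x.mean()) ** 2)

errs = []
for _ in range(100_000):
    y = beta0 + beta1 * x + rng.normal(0, sigma, size=n)
    b1 = np.sum((x - x.mean()) * (y - y.mean())) / Sxx   # slope estimate
    b0 = y.mean() - b1 * x.mean()                        # intercept estimate
    y0_new = beta0 + beta1 * x0 + rng.normal(0, sigma)   # fresh observation at x0
    errs.append(b0 + b1 * x0 - y0_new)                   # Y0_hat - Y0

theory = sigma**2 * (1 + 1/n + (x0 - x.mean())**2 / Sxx)
print(np.var(errs))  # empirical variance of the prediction error
print(theory)        # sigma^2 [1 + 1/n + (x0 - xbar)^2 / Sxx]
```

The empirical and theoretical values should agree closely; note that a *new* error is drawn for $Y_0$ on every replication, which is exactly the independence assumption used to kill the covariances above.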

Another Answer

The best approach here is matrix algebra, both for the derivation and for expressing the result.

For example, the topic of prediction intervals is covered on Pages 136 to 137 of this educational reference.

Succinctly, quoting from the source, the variance expression for the prediction interval is given by:

$$\text{Var}[Y^* - \hat{\beta}'x^* \mid X] = \sigma^2[1 + V_{x^*}]$$

where ${V_{x^*}}$ was previously given on Page 134 as:

$$V_{x^*} = (x^*)'(X'X)^{-1}x^*$$

and for a simple least-squares model with one explanatory variable and an intercept term, the corresponding prediction point vector $x^*$ is given by $[1 \ \ x^*]'$.
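
For this one-variable-plus-intercept case, the quadratic form $V_{x^*} = (x^*)'(X'X)^{-1}x^*$ reduces to the $\frac{1}{n} + \frac{(x_0 - \bar{x})^2}{S_{xx}}$ term from the first answer, which a few lines of numpy can confirm (the design values and prediction point below are made up for illustration):

```python
import numpy as np

x = np.array([1.0, 2.0, 4.0, 7.0])      # illustrative design points
n = x.size
X = np.column_stack([np.ones(n), x])    # design matrix with intercept column
x0 = 3.0
xstar = np.array([1.0, x0])             # prediction point vector [1, x0]

# Matrix form: V_{x*} = x*' (X'X)^{-1} x*
V = xstar @ np.linalg.inv(X.T @ X) @ xstar

# Scalar form from the first answer: 1/n + (x0 - xbar)^2 / Sxx
Sxx = np.sum((x - x.mean()) ** 2)
scalar = 1/n + (x0 - x.mean()) ** 2 / Sxx

print(V, scalar)  # the two forms agree
```

This is why the matrix expression $\sigma^2[1 + V_{x^*}]$ and the scalar expression $\sigma^2\left[1 + \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{S_{xx}}\right]$ are the same formula written in two notations.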