What data do I use to calculate the variance around a predicted value from a simple regression equation?


I am trying to work out which quantity goes where in the formula below for the variance around a predicted value from a simple regression equation $y_d = a+b*x_d$. I previously calculated the expected oxygen binding at an absorbance of $50$ using the following coefficients: intercept $0.05201$ and absorbance (slope) $0.10406$. The regression equation therefore gives $0.05201 + 0.10406 * 50 = 5.25501$. Should $x_d$ in the formula be replaced by this fitted value ($5.25501$) or by $50$?

$$\sigma^2=\left(\frac1N+\frac{(x_d-\overline x)^2}{\sum(x_i-\overline x)^2}\right)$$


There is 1 best solution below


I assume you have data $(x_i,y_i)_{1\le i\le n}$ for $n$ observations on $(x,y)$.

Consider the simple linear regression model

$$y=a+bx+\varepsilon\,,$$

where $x$ is fixed and $\varepsilon$ is random. Suppose the errors $(\varepsilon_i)_{1\le i\le n}$ are independent $N(0,\sigma^2)$ variables.

The mean response at $x=x_d$ is $$\operatorname E(y \mid x_d)=a+bx_d$$

If $(\hat a,\hat b)$ is the least squares estimator of $(a,b)$, then this mean response is estimated by

$$\widehat{\operatorname E(y \mid x_d)}=\hat a+\hat bx_d=\hat\mu_{y\mid x_d}\text{ (say) }$$
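
As a quick check with the numbers from the question, and assuming the reported intercept and slope are the least squares estimates $\hat a=0.05201$ and $\hat b=0.10406$, the estimated mean response at $x_d=50$ works out to

$$\hat\mu_{y\mid x_d}=0.05201+0.10406\times 50=0.05201+5.203=5.25501$$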

The variance of this estimator is given by

$$\operatorname{Var}\left(\hat\mu_{y\mid x_d}\right)=\sigma^2\left[\frac{1}{n}+\frac{(x_d-\overline x)^2}{\sum_{i=1}^n(x_i-\overline x)^2}\right]$$

As I understand it, you have $x_d=50$: here $x_d$ is the value of the predictor (the absorbance) at which you predict, not the fitted value $5.25501$. You also need $\sigma^2$ to calculate this variance. If it is not known, it is estimated by its unbiased estimator $$\hat\sigma^2=\frac1{n-2}\sum_{i=1}^n (y_i-\hat a-\hat b x_i)^2\,,$$

in which case you have an estimated variance

$$\widehat{\operatorname{Var}\left(\hat\mu_{y\mid x_d}\right)}=\hat\sigma^2\left[\frac{1}{n}+\frac{(x_d-\overline x)^2}{\sum_{i=1}^n(x_i-\overline x)^2}\right]$$
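
For a numerical version of the formulas above, here is a minimal sketch in plain Python. The function name `mean_response_variance` and the sample data are hypothetical, made up purely for illustration; they are not the data from the question, so the printed numbers will not match the coefficients quoted there.

```python
def mean_response_variance(x, y, x_d):
    """Return (fitted mean at x_d, estimated variance of that fitted mean)
    for the simple linear regression y = a + b*x + error."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need at least 3 paired observations")

    x_bar = sum(x) / n
    y_bar = sum(y) / n

    # Sum of squares of x about its mean, and the cross-product sum.
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

    # Least squares estimates of slope and intercept.
    b_hat = s_xy / s_xx
    a_hat = y_bar - b_hat * x_bar

    # Unbiased estimate of sigma^2: residual sum of squares over n - 2.
    rss = sum((yi - a_hat - b_hat * xi) ** 2 for xi, yi in zip(x, y))
    sigma2_hat = rss / (n - 2)

    # Fitted mean response at x_d and its estimated variance.
    mu_hat = a_hat + b_hat * x_d
    var_hat = sigma2_hat * (1 / n + (x_d - x_bar) ** 2 / s_xx)
    return mu_hat, var_hat


# Hypothetical absorbance / oxygen-binding measurements (illustration only).
absorbance = [10.0, 20.0, 30.0, 40.0, 60.0]
binding = [1.1, 2.1, 3.2, 4.2, 6.3]

# x_d is the absorbance at which we predict (50), not the fitted value.
mu_hat, var_hat = mean_response_variance(absorbance, binding, 50.0)
print("fitted mean:", mu_hat)
print("estimated variance of fitted mean:", var_hat)
```

Note that `x_d` enters only through $(x_d-\overline x)^2$, so the estimated variance is smallest when you predict at the mean of the observed $x_i$ and grows as $x_d$ moves away from it.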