Show $var(\beta_1)$ is minimum when $x_i$s are evenly distributed at the boundaries for linear regression


I encountered the following question:

Suppose that $n$ is even and the $n$ values of $x_i$ can be selected anywhere in the interval from $a$ to $b$. Show that $var(\beta_1)$ is a minimum if $n/2$ values of $x_i$ are equal to $a$ and $n/2$ values are equal to $b$, where $\beta_1$ is from the simple linear regression $y=\beta_0 + \beta_1 x + \epsilon$.

We know that $var(\beta_1) = \frac{\sigma^2}{\sum(x_i - \bar{x})^2}$, so we need to show that $SXX=\sum(x_i - \bar{x})^2$ is maximized when the $x_i$ sit on the boundaries.

Taking the derivative w.r.t. $x_i$, I get $\partial SXX/ \partial x_i = 2(x_i -\bar{x})$, i.e. $SXX$ is decreasing in $x_i$ when $x_i < \bar{x}$ and increasing when $x_i > \bar{x}$. I'm not sure how to proceed from here.


There is 1 solution below


I can't really show why the values need to be on the boundaries, but if we suppose the $x_i$'s consist of $k$ copies of $a$ and $(n-k)$ copies of $b$, we get $\bar{x} = \frac{ka+(n-k)b}{n}$. Then, $$ \text{Let } L=\sum^{n}_{i=1}(x_i-\bar{x})^2 = \frac{(a-b)^2k(n-k)}{n}\\ \frac{\partial{L}}{\partial{k}}=(a-b)^2-\frac{2(a-b)^2k}{n}=0 \implies k=\frac{n}{2} $$ Since $\partial^2 L/\partial k^2 = -2(a-b)^2/n < 0$, this stationary point is a maximum. In conclusion, $L$ is maximized when $\frac{n}{2}$ of the $x_i$'s equal $a$ and the other $\frac{n}{2}$ equal $b$.
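The closed form $L = (a-b)^2k(n-k)/n$ and the claim that $k = n/2$ maximizes it can be sanity-checked numerically. The following is only a minimal sketch: the function names (`sxx`, `sxx_two_point`, `best_k`, `max_sxx_on_grid`) and the values $a=0$, $b=1$, $n=6$ are illustrative assumptions, not part of the question.

```python
from itertools import product


def sxx(xs):
    """Sum of squared deviations of xs from their mean."""
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs)


def sxx_two_point(a, b, n, k):
    """Closed form of SXX for k points at a and n - k points at b."""
    return (a - b) ** 2 * k * (n - k) / n


def best_k(a, b, n):
    """Return the k in 0..n that maximizes the closed form."""
    return max(range(n + 1), key=lambda k: sxx_two_point(a, b, n, k))


def max_sxx_on_grid(grid, n):
    """Brute-force the largest SXX over all n-point designs drawn from grid."""
    return max(sxx(xs) for xs in product(grid, repeat=n))


if __name__ == "__main__":
    a, b, n = 0.0, 1.0, 6  # illustrative values, not from the question
    for k in range(n + 1):
        design = [a] * k + [b] * (n - k)
        # Direct computation next to the closed form, for comparison.
        print(k, sxx(design), sxx_two_point(a, b, n, k))
    print("best k:", best_k(a, b, n))
    # A coarse grid that includes interior points, to compare with n(b-a)^2/4.
    print("grid max (n=4):", max_sxx_on_grid([0.0, 0.25, 0.5, 0.75, 1.0], 4))
```

The grid search is exponential in $n$, so it is only meant for very small designs; it does not prove the boundary claim, it only fails to find a counterexample.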

However, as I said, I'm not sure why the data should be on the boundaries. This looks like binary data, so it might be related to logistic regression. I hope someone else can add some ideas to my answer :)
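One possible way to justify the boundary step is a convexity argument, sketched here on the assumption that $n \ge 2$ and that all other $x_j$ are held fixed while a single $x_i$ varies: $$ SXX=\sum_{j=1}^{n} x_j^2-\frac{1}{n}\Big(\sum_{j=1}^{n} x_j\Big)^2, \qquad \frac{\partial SXX}{\partial x_i}=2(x_i-\bar{x}), \qquad \frac{\partial^2 SXX}{\partial x_i^2}=2\Big(1-\frac{1}{n}\Big)>0 $$ So $SXX$ is strictly convex in each $x_i$ separately, and a strictly convex function on $[a,b]$ attains its maximum only at an endpoint. Any design with some $x_i$ strictly inside $(a,b)$ can therefore be strictly improved by moving that $x_i$ to $a$ or to $b$, which means a maximizer must have every $x_i \in \{a,b\}$. That reduces the problem to choosing the number $k$ of points placed at $a$.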