I'm reading ESL (The Elements of Statistical Learning), and I have a problem with this passage:
Orthogonal inputs occur most often with balanced, designed experiments (where orthogonality is enforced), but almost never with observational data. Hence we will have to orthogonalize them in order to carry this idea further. Suppose next that we have an intercept and a single input x. Then the least squares coefficient of x has the form:
$\beta_1 = \frac{\left \langle \mathbf{x} -\overline{x}\mathbf{1},y\right \rangle }{\left \langle \mathbf{x} -\overline{x}\mathbf{1},\mathbf{x} -\overline{x}\mathbf{1}\right \rangle}$
where $\overline{x}$ is the mean of the components of $\mathbf{x}$, and $\mathbf{1} = \mathbf{x}_0$ is the vector of $N$ ones.
OK, I know that $\beta = \frac{\left \langle \mathbf{x} ,y\right \rangle }{\left \langle \mathbf{x} ,\mathbf{x} \right \rangle}$ is valid for a univariate model with no intercept, but what about $\beta_1$? Can you tell me how $\beta_1$ is obtained? I really don't understand...
Thank you so much!
The solution to the OLS problem is given by $\beta=(X^\textrm{T}X)^{-1}X^\textrm{T}y$, where $X$ is the regressor matrix (including the column of ones) and $y$ is the vector of dependent observations. In your case:
$$\begin{bmatrix}\beta_0 \\ \beta_1\end{bmatrix}=\left(\begin{bmatrix} 1 & x_1 \\ 1 & x_2 \\ \vdots & \vdots \\ 1 & x_N\end{bmatrix}^{\textrm{T}}\begin{bmatrix} 1 & x_1 \\ 1 & x_2 \\ \vdots & \vdots \\ 1 & x_N\end{bmatrix}\right)^{-1}\begin{bmatrix} 1 & x_1 \\ 1 & x_2 \\ \vdots & \vdots \\ 1 & x_N\end{bmatrix}^{\textrm{T}}\begin{bmatrix}y_1 \\ y_2 \\ \vdots \\ y_N\end{bmatrix}=\begin{bmatrix}N & \sum_{n=1}^N x_n \\ \sum_{n=1}^N x_n & \sum_{n=1}^N x_n^2\end{bmatrix}^{-1}\begin{bmatrix}\sum_{n=1}^N y_n \\ \sum_{n=1}^N x_n y_n\end{bmatrix}.$$
Inverting the $2\times 2$ matrix gives
$$\begin{bmatrix}\beta_0 \\ \beta_1\end{bmatrix}=\frac{1}{N\sum_{n=1}^N x_n^2-\left(\sum_{n=1}^N x_n\right)^2}\begin{bmatrix}\sum_{n=1}^N x_n^2 & -\sum_{n=1}^N x_n \\ -\sum_{n=1}^N x_n & N\end{bmatrix}\begin{bmatrix}\sum_{n=1}^N y_n \\ \sum_{n=1}^N x_n y_n\end{bmatrix}=\begin{bmatrix}\dfrac{\left(\sum_{n=1}^N y_n\right)\left(\sum_{n=1}^N x_n^2\right)-\left(\sum_{n=1}^N x_n\right)\left(\sum_{n=1}^N x_n y_n\right)}{N\sum_{n=1}^N x_n^2-\left(\sum_{n=1}^N x_n\right)^2} \\[3ex] \dfrac{N\sum_{n=1}^N x_n y_n-\left(\sum_{n=1}^N y_n\right)\left(\sum_{n=1}^N x_n\right)}{N\sum_{n=1}^N x_n^2-\left(\sum_{n=1}^N x_n\right)^2}\end{bmatrix}.$$
Dividing numerator and denominator of each entry by $N^2$ and writing $\overline{x}=\frac{1}{N}\sum_{n=1}^N x_n$, $\overline{y}=\frac{1}{N}\sum_{n=1}^N y_n$:
$$\begin{bmatrix}\beta_0 \\ \beta_1\end{bmatrix}=\begin{bmatrix}\dfrac{\overline{y}\,\frac{1}{N}\sum_{n=1}^N x_n^2-\overline{x}\,\frac{1}{N}\sum_{n=1}^N x_n y_n}{\frac{1}{N}\sum_{n=1}^N x_n^2-\overline{x}^2} \\[3ex] \dfrac{\frac{1}{N}\sum_{n=1}^N x_n y_n-\overline{x}\,\overline{y}}{\frac{1}{N}\sum_{n=1}^N x_n^2-\overline{x}^2}\end{bmatrix}.$$
The formula for $\beta_1$ is equivalent to yours: expanding the inner products gives $\left\langle \mathbf{x}-\overline{x}\mathbf{1},\,y\right\rangle=\sum_{n=1}^N x_n y_n-N\overline{x}\,\overline{y}$ and $\left\langle \mathbf{x}-\overline{x}\mathbf{1},\,\mathbf{x}-\overline{x}\mathbf{1}\right\rangle=\sum_{n=1}^N x_n^2-N\overline{x}^2$, so dividing both by $N$ yields exactly the second entry above. For $\beta_0$ we can simplify by adding and subtracting $\overline{y}\,\overline{x}^2$ in the numerator:
$$\beta_0=\frac{\overline{y}\,\frac{1}{N}\sum_{n=1}^N x_n^2-\overline{x}\,\frac{1}{N}\sum_{n=1}^N x_n y_n-\overline{y}\,\overline{x}^2+\overline{y}\,\overline{x}^2}{\frac{1}{N}\sum_{n=1}^N x_n^2-\overline{x}^2}=\frac{\overline{y}\left(\frac{1}{N}\sum_{n=1}^N x_n^2-\overline{x}^2\right)-\overline{x}\left(\frac{1}{N}\sum_{n=1}^N x_n y_n-\overline{x}\,\overline{y}\right)}{\frac{1}{N}\sum_{n=1}^N x_n^2-\overline{x}^2}=\overline{y}-\beta_1\overline{x}.$$
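If it helps, here is a quick numerical sanity check (not from the book; a minimal sketch with made-up data, assuming numpy is available) that the centered inner-product formula agrees with the full $(X^\textrm{T}X)^{-1}X^\textrm{T}y$ solution:

```python
import numpy as np

rng = np.random.default_rng(0)
N = 50
x = rng.normal(size=N)                     # single input
y = 2.0 + 3.0 * x + rng.normal(size=N)     # intercept + slope + noise

# Full OLS solution: beta = (X^T X)^{-1} X^T y, with a column of ones
X = np.column_stack([np.ones(N), x])
beta0, beta1 = np.linalg.solve(X.T @ X, X.T @ y)

# Centered formula: beta_1 = <x - xbar*1, y> / <x - xbar*1, x - xbar*1>
xc = x - x.mean()
beta1_centered = (xc @ y) / (xc @ xc)

print(beta1, beta1_centered)               # agree up to floating-point error
print(beta0, y.mean() - beta1 * x.mean())  # beta_0 = ybar - beta_1 * xbar
```

Running this prints the same slope from both formulas, and an intercept equal to $\overline{y}-\beta_1\overline{x}$, matching the derivation above.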