Least Squares relation with statistical formula


Consider a Simple Linear Regression using Least Squares. If we have 3 equations in 2 unknowns, we are fitting a line $Y = DX + C$:

$$C + Dx_1 = y_1$$ $$C + Dx_2 = y_2$$ $$C + Dx_3 = y_3$$

In Linear Algebra, the least-squares fitted values are given by projecting $Y$ onto the column space of $A$: $$\hat{Y} = A{(A^TA)}^{-1}A^TY,$$ where $P = A{(A^TA)}^{-1}A^T$ is the projection matrix and the coefficient vector $[C \;\; D]^T$ is $${(A^TA)}^{-1}A^TY.$$

And in Statistics, the regression slope is given by $$ D = \frac{\sum_{i=1}^n(X_i - \bar X)(Y_i - \bar Y)}{\sum_{i=1}^n(X_i - \bar X)^2}$$

$$C = \bar Y - D\bar X$$

There is clearly some relation between the two sets of equations. I want to understand how the projection formula from linear algebra reduces to the formulas used in statistics.
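As a quick numerical sanity check (the data points below are made up for illustration), the coefficients $[C\ D]^T$ from the normal equations and from the statistical formulas coincide:

```python
import numpy as np

# Hypothetical data points, chosen only for illustration.
x = np.array([1.0, 2.0, 4.0])
y = np.array([1.0, 2.0, 5.0])

# Linear-algebra route: solve the normal equations A^T A [C, D]^T = A^T y.
A = np.column_stack([np.ones_like(x), x])
C_la, D_la = np.linalg.solve(A.T @ A, A.T @ y)

# Statistical route: covariance-over-variance slope, then the intercept.
D_stat = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
C_stat = y.mean() - D_stat * x.mean()

assert np.isclose(C_la, C_stat) and np.isclose(D_la, D_stat)
```

Both routes produce the same $(C, D)$, which is the equivalence the answer below derives.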


Let us look at this relationship in an even simpler case, where the model is $y_i = \beta + \epsilon_i$; i.e., you have $n$ observations of the form $\{ ( y_i, 1)\}_{i=1}^n$, hence $n$ equations of the form $y_i=\beta$. This is clearly an over-determined system: for continuous $y$, each equation demands a different value of $\beta$. If you consider the projection without any statistical considerations, you construct the projection matrix $$ H = \mathbf{1}(\mathbf{1}'\mathbf{1})^{-1}\mathbf{1}'=\frac{1}{n}J, $$
where $J$ is the matrix with all entries equal to $1$. So the fitted values are $$ Hy = \hat{y}=\frac{1}{n}\Big(\sum_{i=1}^ny_i,\ldots, \sum_{i=1}^ny_i\Big)^T, $$ namely $\hat{y}_i = \bar{y}_n$ for all $i$.
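A small check of the intercept-only case (with made-up data), confirming that $H = J/n$ and that $Hy$ is the vector of sample means:

```python
import numpy as np

y = np.array([2.0, 3.0, 7.0, 8.0])  # hypothetical observations
n = len(y)

# Design matrix is the all-ones column; build its projection matrix.
one = np.ones((n, 1))
H = one @ np.linalg.inv(one.T @ one) @ one.T

assert np.allclose(H, np.ones((n, n)) / n)          # H = J / n
assert np.allclose(H @ y, np.full(n, y.mean()))     # fitted values are all ȳ
```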

Now take the least-squares approach, where the best estimator of $\beta$ is $$ \hat{\beta} = (X'X)^{-1}X'y=(\mathbf{1}'\mathbf{1})^{-1}\mathbf{1}'y = \bar{y}_n, $$ so every fitted value is $$ \hat{y}_i = \hat{\beta}=\bar{y}_n. $$ As you can see, the results are identical. To show it for your case, just write down the projection matrix for $$ X = \begin{pmatrix} 1 & x_1 \\ 1 & x_2 \\ \vdots & \vdots \\ 1 & x_n \end{pmatrix} $$ and compare each fitted value with the OLS results. It is slightly messier, but it follows the same logic as the illustration above.
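Carrying out that comparison numerically for the two-column $X$ (again with made-up data): the fitted values from the projection matrix match $C + Dx_i$ computed with the statistical formulas.

```python
import numpy as np

# Hypothetical data for illustration.
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([1.0, 3.0, 2.0, 5.0])

# Projection (hat) matrix for X = [1 | x], and the projected fitted values.
X = np.column_stack([np.ones_like(x), x])
H = X @ np.linalg.inv(X.T @ X) @ X.T
y_hat_proj = H @ y

# Fitted values from the statistical slope/intercept formulas.
D = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
C = y.mean() - D * x.mean()

assert np.allclose(y_hat_proj, C + D * x)
```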