Finding equation of best fit line in simple linear regression

To find the best-fit line for a set of data $(x_i,y_i)$ by least squares, one minimizes $S$ with respect to $\theta_0$ and $\theta_1$, that is, one sets $\frac{\partial S}{\partial \theta_0}$ and $\frac{\partial S}{\partial \theta_1}$ to zero, where \begin{equation} S = \frac{1}{n}\sum_{i=1}^n (y_i - (\theta_0 + \theta_1x_i))^2 \end{equation}
Setting $\frac{\partial S}{\partial \theta_0} = 0$ gives (where $\bar{x}$ and $\bar{y}$ denote the means of the $x$ and $y$ values respectively) \begin{equation} \bar{y} - \theta_1\bar{x} = \theta_0 \end{equation} and setting $\frac{\partial S}{\partial \theta_1} = 0$ and simplifying gives \begin{equation} \sum_{i=1}^n y_ix_i - \sum_{i=1}^n\theta_0x_i - \sum_{i=1}^n\theta_1x_i^2 = 0 \end{equation} I'm stuck here. How do I proceed from this to obtain $\theta_1$ in the following form? \begin{equation} \theta_1 = \frac{\sum_{i=1}^n (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^n (x_i - \bar{x})^2} \end{equation}

Best answer

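Plug $\hat{\theta}_0 = \bar{y} - \theta_1\bar{x}$ into the second equation, the one obtained from $\frac{\partial S}{\partial \theta_1} = 0$. Keep track of the signs: the derivative of $-x_i\theta_1$ w.r.t. $\theta_1$ is $-x_i$, so both the $\theta_0$ term and the $\theta_1$ term enter with a minus sign. Then you have $$ \sum x_i y_i - \sum x_i ( \bar{y} - \theta_1 \bar{x} ) - \theta_1\sum x_i ^ 2 = 0 $$ $$ \sum x_i y _i - \bar{y}\sum x_i + \theta_1\bar{x}\sum x_i - \theta_1 \sum x_i ^ 2 = 0 $$ Note that $\sum x_i = n \bar{x}$, and solve for $\theta_1$: $$ \hat{\theta}_1 = \frac{ \sum x_i y _i - n\bar{y}\bar{x} }{ \sum x_i ^2 - n \bar{x}^2 } = \frac{ \sum (x_i - \bar{x}) (y _i - \bar{y}) }{ \sum ( x_i - \bar{x})^2 } $$ The last equality follows by expanding the products: $\sum (x_i - \bar{x})(y_i - \bar{y}) = \sum x_i y_i - \bar{y}\sum x_i - \bar{x}\sum y_i + n\bar{x}\bar{y} = \sum x_i y_i - n\bar{x}\bar{y}$, and the same expansion with $y$ replaced by $x$ gives the denominator.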
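If it helps to see the result numerically, here is a small sketch in plain Python that computes the slope in both the raw-sum form and the centered form and checks that the fitted line satisfies the two first-order conditions. The function names and the sample data are made up for illustration; they are not part of the question.

```python
def slope_raw(xs, ys):
    """Slope as (sum x_i y_i - n*x_bar*y_bar) / (sum x_i^2 - n*x_bar^2)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    num = sum(x * y for x, y in zip(xs, ys)) - n * x_bar * y_bar
    den = sum(x * x for x in xs) - n * x_bar ** 2
    return num / den


def slope_centered(xs, ys):
    """Slope as sum (x_i - x_bar)(y_i - y_bar) / sum (x_i - x_bar)^2."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    num = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    den = sum((x - x_bar) ** 2 for x in xs)
    return num / den


# Hypothetical sample data, chosen only for illustration.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.2, 4.1, 6.2, 7.9, 10.1]

theta1 = slope_centered(xs, ys)
# Intercept from the first condition: theta0 = y_bar - theta1 * x_bar.
theta0 = sum(ys) / len(ys) - theta1 * sum(xs) / len(xs)

# Left-hand sides of the two first-order conditions (up to a constant factor);
# both should be zero up to rounding at the least-squares solution.
cond0 = sum(y - (theta0 + theta1 * x) for x, y in zip(xs, ys))
cond1 = sum(x * (y - (theta0 + theta1 * x)) for x, y in zip(xs, ys))

print("theta0 =", theta0, "theta1 =", theta1)
print("raw-sum slope =", slope_raw(xs, ys))
print("first-order conditions:", cond0, cond1)
```

The two slope functions are algebraically the same; in floating point the centered form is usually the safer choice, since the raw-sum form subtracts two potentially large, nearly equal numbers.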