Understanding linear regression with more than one independent variable

Linear regression (least squares approximation) is very useful for predicting various events. However, I have never fully understood the math behind it (I don't have much experience in math). To get a better understanding, I am going to build a program that computes it for me. At the moment I only understand how to compute it for a dataset with 1 independent variable and 1 dependent variable (simple linear regression), and even then I do not understand the idea behind all of the formulas. To show in more detail what I don't understand, I am going to give an example.

Let's say we have the following table:

X      Y
4      5
10     11

My understanding of simple linear regression is that the goal is to find the line that best fits all of the points. Let's start with the formula of a line:

$y = b_0 + b_1*x$

where $b_0$ is the intercept and $b_1$ is the slope.

For computing the slope I found the following formula on the Internet: $b_1 = \Sigma{((x-meanX)(y-meanY))} / \Sigma{((x-meanX)^2)}$

However, I don't really understand why this formula works. Why do you have to multiply?

For computing the intercept, I also found the following formula: $b_0 = mean(y) - b_1*mean(x)$

For computing both of the parameters, we first have to compute the mean of both of the columns.

meanX = $(4+10)/2 = 7$

meanY = $(5+11)/2 = 8$

Now that we have the means, we can calculate the slope of the line.

$ \Sigma{((x-meanX)^2)} = (4-7)^2 + (10-7)^2 = 9 + 9 = 18$

$ \Sigma{((x-meanX)(y-meanY))} = ((4-7)*(5-8)) + ((10-7)*(11-8)) = 9 + 9 = 18$

Which means that the slope ($b_1$) is equal to $18/18=1$. So, for now we have $y = b_0 + 1*x$

Then we have to fill in the formula of the intercept: $b_0 = 8 - 1*7 = 1$

This means that we end up with the following equation: $y = 1 + 1*x$, and the linear regression function is $f(x) = x + 1$. As a check, $f(4) = 5$ and $f(10) = 11$, which matches the table.
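To check the hand computation, here is a minimal Python sketch of the two formulas: the slope is the sum of the products of the x- and y-deviations divided by the sum of the squared x-deviations, and the intercept is mean(y) minus the slope times mean(x). The function name `simple_linear_regression` and the list-based inputs are my own choices for illustration, not taken from any library.

```python
def simple_linear_regression(xs, ys):
    """Return (b0, b1) for the least-squares line y = b0 + b1*x."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long columns with at least 2 rows")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x from its mean (the denominator).
    sxx = sum((x - mean_x) ** 2 for x in xs)
    # Sum of products of the x- and y-deviations (the numerator).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0:
        raise ValueError("all x values are equal, so the slope is undefined")
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    return b0, b1


# The X and Y columns of the table above.
b0, b1 = simple_linear_regression([4, 10], [5, 11])
print(f"y = {b0} + {b1}*x")
```

Because the table has only two points, the fitted line should pass through both of them exactly, which gives an easy sanity check on the result.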

My problem is that I am not sure how to manually calculate the parameters for a dataset with more than one independent variable. Is there a pattern I can follow to do the same for, let's say, 3 independent variables?
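One pattern that seems to carry over is the matrix form of least squares: put a column of ones (for the intercept) next to the independent variables to form a matrix $X$, then solve the normal equations $(X^T X) b = X^T y$ for the coefficient vector $b$. Below is a rough sketch in plain Python, assuming the system has a unique solution (no independent variable is a linear combination of the others). The function names and the sample data are hypothetical and only there for illustration.

```python
def solve_linear_system(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(a)
    # Augmented matrix [a | b], copied so the inputs are not modified.
    m = [list(map(float, row)) + [float(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        # Pick the row with the largest entry in this column as the pivot.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: variables are linearly dependent")
        m[col], m[pivot] = m[pivot], m[col]
        # Eliminate this column from all rows below the pivot row.
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution, from the last unknown to the first.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def multiple_linear_regression(rows, ys):
    """Return [b0, b1, ..., bk] for y = b0 + b1*x1 + ... + bk*xk."""
    # Design matrix: a leading 1 in every row stands for the intercept b0.
    design = [[1.0] + [float(v) for v in row] for row in rows]
    k = len(design[0])
    # Normal equations: (X^T X) b = X^T y.
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(design, ys)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Hypothetical data with 3 independent variables, generated from
# y = 1 + 2*x1 + 3*x2 + 4*x3, so the fit should recover those coefficients.
rows = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 0], [1, 2, 3]]
ys = [1 + 2 * a + 3 * b + 4 * c for a, b, c in rows]
coefficients = multiple_linear_regression(rows, ys)
print([round(c, 6) for c in coefficients])
```

With a single independent variable, the same normal equations reduce to the slope and intercept formulas used in the example above, so the simple case is a special case of this pattern.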