MIT Statistics course: multi-dimensional matrix form for Regression


In the lecture here, at 31:13, the instructor explains regression in multi-dimensional matrix form, but he starts with an example in 1 dimension ([screenshot of the illustration is below]) and says that if $X\hat \beta$ is the fit that minimizes the distance to the observations, then $\|Y\|^2 = \|X\hat \beta\|^2+\|Y- X\hat \beta\|^2$. In the illustration, what he actually highlights as $Y$ is a point rather than a single value $Y$, so I don't know how the equation maps to the illustration on the board.

[illustration]

Any help is much appreciated.


There is 1 best solution below


You are confusing two different pictures of regression with matrices and least squares.

Picture number 1: you have your data points on, say, a 2-dimensional plane $(x, y)$ and you are trying to draw the line with the "least squared" errors. In that picture each error is the vertical distance from a data point to the line you want to draw (vertical, not perpendicular to the line). I think you are mixing up that picture with the one drawn on the board, which is the one I'm going to elaborate on.

Notice that $X \hat \beta$ is a vector. To fully understand this, you need to remember how a system of linear equations works. For each data point you have $y = C + Dt$ (I'm writing that instead of $y = mx + b$ because I will be using $x$ and $b$ somewhere else), where $t$ and $y$ are known and you are trying to find the $C$ and $D$ that fit all the $t$'s and $y$'s (the coordinates of your data points). Notice the system of linear equations? We have $Ax = b$: each row of $A$ holds the coefficients of $C$ and $D$ for one data point, that is $(1, t_i)$; $x$ (the unknown) is $(C, D)$; and $b$ collects the $y$ of each data point.

In some cases this system is solvable, meaning there is a line that goes through all the points. But what if there is no solution? This is where vector spaces come into play. When is $Ax = b$ solvable? When $b$ is in the column space of $A$. What is the best fit if $Ax = b$ isn't solvable, in other words when $b$ isn't in the column space of $A$? The projection of $b$ onto the column space of $A$ (it has a formula, which was mentioned in the lecture you linked), and the error itself will be a vector.

In this picture, $X \hat \beta$ is the projection of $b$ onto the column space, $Y$ is $b$ itself, and $\varepsilon$ is $b$ minus its projection (the error). So the point labeled $Y$ on the board is the whole vector of observations, not a single value. Because the error is perpendicular to the column space, $Y$, $X \hat \beta$ and $Y - X \hat \beta$ form a right triangle, and the Pythagorean theorem gives $\|Y\|^2 = \|X\hat \beta\|^2+\|Y- X\hat \beta\|^2$.

I highly recommend watching this lecture of MIT 18.06: https://www.youtube.com/watch?v=osh80YCg_GM&t=378s
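To make the vector picture concrete, here is a minimal numerical sketch in plain Python. The four data points are made up for illustration (they are not from the lecture), and the names `fitted`, `residual` and `dot` are my own. It fits $y = C + Dt$ through the normal equations $A^T A x = A^T b$ and then checks that the error vector is orthogonal to both columns of $A$, which is what makes the Pythagorean identity hold.

```python
# Made-up data for illustration: times t and observations y (the vector b, i.e. Y).
t = [0.0, 1.0, 2.0, 3.0]
y = [1.0, 2.0, 2.0, 4.0]
n = len(t)

def dot(u, v):
    """Euclidean inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))

# Columns of A: the all-ones vector (multiplies C) and t (multiplies D).
ones = [1.0] * n

# Normal equations A^T A x = A^T b, written out for the 2x2 case.
s_t, s_tt = dot(ones, t), dot(t, t)
s_y, s_ty = dot(ones, y), dot(t, y)
det = n * s_tt - s_t * s_t             # determinant of A^T A
C = (s_y * s_tt - s_t * s_ty) / det    # intercept
D = (n * s_ty - s_t * s_y) / det       # slope

# The projection of y onto the column space of A (this is X beta_hat) ...
fitted = [C + D * ti for ti in t]
# ... and the error vector y minus its projection.
residual = [yi - fi for yi, fi in zip(y, fitted)]

# The error is orthogonal to each column of A, hence to the projection.
print("residual . ones =", dot(residual, ones))
print("residual . t    =", dot(residual, t))

# Pythagoras in R^n: ||Y||^2 = ||X beta_hat||^2 + ||Y - X beta_hat||^2
lhs = dot(y, y)
rhs = dot(fitted, fitted) + dot(residual, residual)
print("||Y||^2 =", lhs, " sum of the two squared norms =", rhs)
```

Both dot products should print as zero up to floating-point rounding, and the two sides of the identity should agree to the same precision. Note that `y`, `fitted` and `residual` are each a single point in $\mathbb{R}^4$ here, which is the sense in which $Y$ is drawn as one point on the board.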