Will someone explain this polynomial regression equation?


I am in high school and I need to write a program that does polynomial regression to any degree on a set of data for a personal project. I think that this Wikipedia Article has the equation that I need to use. I need someone to explain this equation in particular. We have not covered anything like this in school. I have a basic understanding of matrices, but a lot of these symbols are new to me.

To clarify, I do not need help with the programming aspect but I need to understand the equation so that I can work with it.


There are 3 best solutions below


Can you clarify exactly what you do not understand?

$X^T$ is the transpose of $X$.

$X^{-1}$ is the inverse matrix of $X$.

Does this help?
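If it helps to see these two operations concretely, here is a minimal sketch using Python's NumPy library (the matrix is made up purely for illustration):

```python
import numpy as np

# A small 2x2 matrix to demonstrate transpose and inverse
X = np.array([[1.0, 2.0],
              [3.0, 4.0]])

X_T = X.T                 # transpose: rows become columns
X_inv = np.linalg.inv(X)  # inverse: X @ X_inv equals the identity

print(X_T)
print(X @ X_inv)          # approximately the 2x2 identity matrix
```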


Notation: $\langle x,y \rangle$ is the inner product between $x$ and $y$. It is sometimes called the dot product, denoted by $x \cdot y$. In any case we have $\langle x,y \rangle = \sum_{i=1}^n x_i y_i$.

You want an approximate solution to $Ax=b$. The trick is to choose it such that $Ax-b$ is orthogonal to the column space of $A$ (the span of the columns of $A$). In other words you want $\langle Ay,Ax-b \rangle$ to be zero for every $y$. This is equivalent to $\langle y,A^T (Ax-b) \rangle$ being zero for every $y$, which is equivalent to $A^T(Ax-b)$ being zero. So $A^T Ax = A^T b$, which is your equation.

The geometry behind this idea is the following. Suppose $y$ is some other candidate for the least squares solution. Then by definition, $Ax-b$ is orthogonal to $A(x-y)=Ax-Ay$, so the Pythagorean theorem tells you that $\| Ay - b \|^2 = \| Ax - b \|^2 + \| Ax - Ay \|^2 \geq \| Ax-b \|^2$.
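A quick numerical check of this orthogonality, sketched in Python with NumPy (the overdetermined system here is invented for illustration):

```python
import numpy as np

# An overdetermined system Ax = b (4 equations, 2 unknowns)
A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
b = np.array([1.0, 2.0, 2.0, 4.0])

# Solve the normal equations: A^T A x = A^T b
x = np.linalg.solve(A.T @ A, A.T @ b)

# The residual Ax - b is orthogonal to every column of A,
# so A^T (Ax - b) is (numerically) the zero vector:
residual = A @ x - b
print(A.T @ residual)
```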


For polynomial regression you are trying to create a model for your data as a sum of terms that is linear in the unknown coefficients.

Imagine you have $n$ data points indexed by $i = 1, \dots, n$. For a degree-$d$ polynomial,

$y_i = a_0 + a_1x_i + a_2x_i^2 + \dots + a_dx_i^d + \epsilon_i$

It is easier to see these equations in matrix form:

$\vec{y} = X\vec{a} + \vec{\epsilon}$

Here $\vec{y}$ and $\vec{\epsilon}$ are vectors of length $n$, and $\vec{a}$ holds the coefficients $a_0, a_1, \dots$ (so its length is the number of coefficients, not $n$),

and $X$ is a matrix with $n$ rows and as many columns as you have coefficients. So for a quadratic that is 3 (the intercept, $x$, and $x^2$), for a cubic 4, and so on.
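For instance, the design matrix for a quadratic fit can be built with NumPy's `np.vander` (a sketch; the data points are made up):

```python
import numpy as np

x = np.array([0.0, 1.0, 2.0, 3.0])  # n = 4 data points

# Columns are 1, x, x^2 -- one column per coefficient a_0, a_1, a_2
X = np.vander(x, N=3, increasing=True)
print(X)
# [[1. 0. 0.]
#  [1. 1. 1.]
#  [1. 2. 4.]
#  [1. 3. 9.]]
```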

Right, so now you have the problem: how do we determine the coefficients ($a_0, a_1, \dots$) that produce a best-fit model to the data $y$?

One easy solution in this case is least squares regression estimation.

In brief (as this has been described countless times on this forum):

We want to minimize the squared "distance" between the model and the data, thereby producing a model that best fits and represents the data. Therefore we minimize:

$E=\frac{1}{n}|| X\vec{a} - \vec{y}||^2$

which is done using the standard calculus technique of computing the gradient and setting it to zero:

$\nabla E=\frac{2}{n} X^T(X\vec{a}-\vec{y})=0$

$X^T X \vec{a} = X^T \vec{y}$

so your best-fit coefficients are

$\vec{a}=(X^T X)^{-1}X^T \vec{y}$

Now this is just the start. I suggest you consult Yaser Abu-Mostafa's excellent lectures on the subject, found for free here https://work.caltech.edu/telecourse.html

This is easy to put into code as well. You have the final equation: collect your terms to form $X$ (also called the design matrix), then compute the transpose, matrix product, and inverse using pre-built functions. (In practice it is numerically safer to solve the linear system $X^TX\vec{a}=X^T\vec{y}$ than to form the inverse explicitly.) Or just use the lm function in R...
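The whole pipeline above can be sketched in Python with NumPy (the synthetic data is invented for illustration, and `np.polyfit` is used only as an independent cross-check):

```python
import numpy as np

def polyfit_normal_equations(x, y, degree):
    """Least-squares polynomial fit via the normal equations."""
    # Design matrix: columns 1, x, x^2, ..., x^degree
    X = np.vander(x, N=degree + 1, increasing=True)
    # a = (X^T X)^{-1} X^T y -- solve() is preferred over an explicit inverse
    return np.linalg.solve(X.T @ X, X.T @ y)

# Noisy samples of y = 1 + 2x + 3x^2
rng = np.random.default_rng(0)
x = np.linspace(-1, 1, 50)
y = 1 + 2 * x + 3 * x**2 + 0.01 * rng.standard_normal(50)

a = polyfit_normal_equations(x, y, degree=2)
print(a)  # close to [1, 2, 3]

# Cross-check against NumPy's built-in fit (note: polyfit returns
# coefficients in the opposite order, highest degree first)
assert np.allclose(a, np.polyfit(x, y, 2)[::-1])
```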