I am studying for my linear algebra final, and it was suggested that I learn how to derive the normal equation used in solving least squares problems. I have been looking in various places online without any luck, probably because I don't know the topic well enough to translate my professor's notation into what I find. Note: the notation I'm using comes directly from my class, and I don't have a textbook to reference because he doesn't use one.
Here is an example from my class; I'm trying to use it to build a general example from which I could derive the normal equation $A^TA\vec x=A^T\vec b$.
Given the model $N=a_0+a_1t$ and a graph with 4 collected data points of the form $(t_i,n_i)$, I created this error function:
$E_2\begin{bmatrix}a_0\\a_1\end{bmatrix}= (a_0+a_1t_1-n_1)^2+(a_0+a_1t_2-n_2)^2+(a_0+a_1t_3-n_3)^2+(a_0+a_1t_4-n_4)^2=e_1^2+e_2^2+e_3^2+e_4^2=\Vert e\Vert^2= \Vert A\vec x-\vec b\Vert^2 $
The summation I have is: $$\sum_{i=1}^4 e_i^2 = (a_0+a_1t_i-n_i)^2$$
Since this is for a specific case I know I need to find a general error function to work with in order to derive the normal equation, so I created this one:
$$E_2\begin{bmatrix}a_1\\a_2\\\vdots\\a_n\end{bmatrix}=\sum_{i=1}^n e_i^2 = (a_0+a_1t_i+a_2t_i^2+\cdots+a_nt_i^n-n_i)^2$$
I'm not even sure that I need this information to do my derivation. What do I need to do?
Assuming I have understood the problem correctly: given $N$ data points $(t_i,n_i)$, you "almost" wrote the error function correctly (I suppose the missing second summation symbol is just a typo).
Let us call it $SSQ$ (for the sum of squares): $$SSQ=\sum_{i=1}^N e_i^2 =\sum_{i=1}^N (a_0+a_1t_i-n_i)^2$$ This is the quantity you want to minimize. As usual, we take its derivatives with respect to the unknown parameters $(a_0,a_1)$ and set them equal to $0$.
So $$\frac{\partial\, SSQ}{\partial a_0}=\sum_{i=1}^N 2(a_0+a_1t_i-n_i)=0$$ $$\frac{\partial\, SSQ}{\partial a_1}=\sum_{i=1}^N 2t_i(a_0+a_1t_i-n_i)=0$$ Since the factor $2$ multiplies every term, we can drop it and distribute the summations. $$\frac{\partial\, SSQ}{\partial a_0}=0\implies a_0\sum_{i=1}^N 1+a_1\sum_{i=1}^N t_i-\sum_{i=1}^N n_i=0$$ $$\frac{\partial\, SSQ}{\partial a_1}=0\implies a_0\sum_{i=1}^N t_i+a_1\sum_{i=1}^N t_i^2-\sum_{i=1}^N n_it_i=0$$
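If you want to convince yourself that the two derivative formulas are right, a quick numerical check can help. The sketch below is only an illustration: the data points, the trial values of $a_0,a_1$ and the function names are made up, and the comparison uses plain central finite differences.

```python
def ssq(a0, a1, ts, ns):
    """Sum of squared residuals for the line a0 + a1*t."""
    return sum((a0 + a1 * t - n) ** 2 for t, n in zip(ts, ns))


def grad_ssq(a0, a1, ts, ns):
    """Analytic derivatives of SSQ with respect to a0 and a1."""
    d0 = sum(2 * (a0 + a1 * t - n) for t, n in zip(ts, ns))
    d1 = sum(2 * t * (a0 + a1 * t - n) for t, n in zip(ts, ns))
    return d0, d1


def numeric_grad(a0, a1, ts, ns, h=1e-6):
    """Central finite differences, used only as a cross-check."""
    d0 = (ssq(a0 + h, a1, ts, ns) - ssq(a0 - h, a1, ts, ns)) / (2 * h)
    d1 = (ssq(a0, a1 + h, ts, ns) - ssq(a0, a1 - h, ts, ns)) / (2 * h)
    return d0, d1


# Made-up data points (t_i, n_i) and arbitrary trial parameters.
ts = [0.0, 1.0, 2.0, 3.0]
ns = [1.0, 3.0, 4.0, 8.0]
analytic = grad_ssq(0.5, 1.5, ts, ns)
numeric = numeric_grad(0.5, 1.5, ts, ns)
print(analytic, numeric)
```

The two pairs should agree up to rounding, since $SSQ$ is quadratic in $a_0$ and $a_1$.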
Putting the unknowns on the left-hand side and the remaining terms on the right-hand side, we then have $$a_0 N+a_1\sum_{i=1}^N t_i=\sum_{i=1}^N n_i$$ $$a_0 \sum_{i=1}^N t_i+a_1\sum_{i=1}^N t_i^2=\sum_{i=1}^N n_it_i$$ These are the normal equations; they are linear in $a_0,a_1$ and easy to solve.
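To connect this with the matrix form in your question: if $A$ is the matrix whose $i$-th row is $(1,\;t_i)$, $\vec x=(a_0,a_1)^T$ and $\vec b=(n_1,\dots,n_N)^T$, then $$A^TA=\begin{bmatrix}N & \sum t_i\\ \sum t_i & \sum t_i^2\end{bmatrix},\qquad A^T\vec b=\begin{bmatrix}\sum n_i\\ \sum n_it_i\end{bmatrix}$$ so the two equations above are exactly $A^TA\vec x=A^T\vec b$. Here is a minimal sketch in plain Python that checks this on made-up data and solves the $2\times 2$ system by Cramer's rule; the function names and the data are hypothetical, chosen only for illustration.

```python
def normal_system(ts, ns):
    """Build A^T A and A^T b for the design matrix with rows (1, t_i)."""
    A = [[1.0, t] for t in ts]
    AtA = [[sum(row[i] * row[j] for row in A) for j in range(2)]
           for i in range(2)]
    Atb = [sum(row[i] * n for row, n in zip(A, ns)) for i in range(2)]
    return AtA, Atb


def fit_line(ts, ns):
    """Solve the summation form of the normal equations for (a0, a1)."""
    N = len(ts)
    St = sum(ts)                                # sum of t_i
    Stt = sum(t * t for t in ts)                # sum of t_i^2
    Sn = sum(ns)                                # sum of n_i
    Snt = sum(n * t for t, n in zip(ts, ns))    # sum of n_i * t_i
    det = N * Stt - St * St
    if det == 0:
        raise ValueError("all t_i are equal; the line is not determined")
    a0 = (Sn * Stt - St * Snt) / det
    a1 = (N * Snt - St * Sn) / det
    return a0, a1


# Made-up data points (t_i, n_i).
ts = [0.0, 1.0, 2.0, 3.0]
ns = [1.0, 3.0, 4.0, 8.0]
AtA, Atb = normal_system(ts, ns)
a0, a1 = fit_line(ts, ns)
print(AtA, Atb, a0, a1)
```

For these four points the entries of `AtA` and `Atb` are the sums $N=4$, $\sum t_i=6$, $\sum t_i^2=14$, $\sum n_i=16$, $\sum n_it_i=35$, and the fit comes out at approximately $a_0=0.7$, $a_1=2.2$.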