I have the following data sets, and I'm asked to apply linear regression to find the optimal solution (the example is taken from the book Data Analytics by T. Runkler):
$X = \{1,2,4,5,3 \}$ and $Y = \{ -1, 1,1,-1,15\}$
Now if I apply linear regression to those sets (i.e. $a = (X^T \cdot X)^{-1} \cdot X^T \cdot Y$) I get $45/55 \approx 0.82$. But in the book the answer is 3. Moreover, the book gives the following as a solution: "$\frac{-1 + 1+ 15+1 -1}{5} = 3$ due to symmetry". I can't understand how one arrives at this solution from the linear regression formula. What does symmetry mean in this case? Furthermore, where does the 5 in the denominator come from? It seems like we're calculating $\frac{\sum_{i = 1}^{n} y_i}{|Y|}$, which is the mean of $Y$.
You have a model $Y=X\beta+\epsilon$ with an intercept. It looks like
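$$\begin{pmatrix}-1\\1\\1\\-1\\15\end{pmatrix}=\begin{pmatrix}1&1\\1&2\\1&4\\1&5\\1&3\end{pmatrix}\begin{pmatrix}\beta_0\\\beta_1\end{pmatrix}+\epsilon$$
The least squares solution is given by $(X^TX)^{-1}X^TY=\begin{pmatrix}3\\0\end{pmatrix}$, so I guess the book's answer 3 is the intercept $\beta_0$, as opposed to the slope $\beta_1$, which is the coefficient of "X". The data are symmetric about $x=3$ ($y=-1$ at $x=1$ and $x=5$, $y=1$ at $x=2$ and $x=4$), so the slope is $0$. With zero slope the least squares intercept is just the mean of $Y$, which is where the 5 in the denominator comes from. Your $45/55$ is the slope of the model without an intercept, $Y=aX+\epsilon$.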
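As a sanity check, here is a minimal NumPy sketch (my own, not from the book) that fits both models to the data, pairing each $x_i$ with the $y_i$ at the same position as in the question. The variable names are just illustrative.

```python
import numpy as np

# Data paired by position, exactly as listed in the question.
x = np.array([1, 2, 4, 5, 3], dtype=float)
y = np.array([-1, 1, 1, -1, 15], dtype=float)

# Model without an intercept, y ~ a * x: a = (x^T x)^{-1} x^T y.
a_no_intercept = (x @ y) / (x @ x)

# Model with an intercept, y ~ beta0 + beta1 * x:
# the design matrix gets a leading column of ones.
X = np.column_stack([np.ones_like(x), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)

print("slope through the origin:", a_no_intercept)
print("intercept and slope:", beta)
print("mean of y:", y.mean())
```

The first number should match $45/55$, while the fit with an intercept should give a slope of (numerically) zero and an intercept equal to the mean of $Y$.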