For a linear regression task I have the basic cost function $E(a,b) = \sum_{i=1}^m(y_i - f(x_i))^2$, where $f(x) = ax+b$.
I'm now looking for an analytical solution without matrix notation. I took the partial derivatives w.r.t. $a$ and $b$, but I think I've made a mistake somewhere. My derived expressions for $a$ and $b$ respectively are:
$a = \frac{\sum y_ix_i - b\sum x_i}{\sum x_i^2}$
$b = \sum y_i - a \sum x_i$
I've put $b$ into $a$ and got: $a = \frac{\sum y_ix_i - \sum y_i \sum x_i}{\sum x_i^2 - (\sum x_i)^2}$
I've coded and tested that in Python, but I only got the correct results when I changed $a$ and $b$ to:
$a = \frac{m \sum y_ix_i - b\sum x_i}{m\sum x_i^2}$
$b = \frac{\sum y_i - a \sum x_i}{m}$
which doesn't make sense to me at all. I hope somebody can help me.
As you've already figured out, you are missing a factor of $m$. To see where it comes from, consider a simpler case where you drop the linear term $ax$, i.e. $f(x) = b$. The derivative w.r.t. $b$ is $-2\sum_{i=1}^m (y_i-b) = 2mb - 2\sum_{i=1}^m y_i$; setting it to zero gives $b = \frac{1}{m}\sum_{i=1}^m y_i$, i.e. the average value, while your formula gives the sum.
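For completeness: with the factor $m$ in place, substituting $b = \frac{1}{m}(\sum y_i - a\sum x_i)$ into the equation for $a$ gives the standard closed form $a = \frac{m\sum y_ix_i - \sum x_i \sum y_i}{m\sum x_i^2 - (\sum x_i)^2}$. A quick numerical sanity check in plain Python (the data below are made up for illustration; points on an exact line $y = 2x + 3$ should be recovered exactly):

```python
# Illustrative data: points lying exactly on y = 2x + 3,
# so the fit should return a = 2 and b = 3.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2 * x + 3 for x in xs]

m = len(xs)
sx = sum(xs)                              # sum of x_i
sy = sum(ys)                              # sum of y_i
sxx = sum(x * x for x in xs)              # sum of x_i^2
sxy = sum(x * y for x, y in zip(xs, ys))  # sum of x_i * y_i

# Closed-form least-squares solution (note the factor m):
a = (m * sxy - sx * sy) / (m * sxx - sx * sx)
b = (sy - a * sx) / m

print(a, b)  # 2.0 3.0
```

Without the division by $m$ in $b$ (your original formula on line two), $b$ comes out five times too large on this data, which is exactly the symptom you observed.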