I'm new to Stack Exchange, so please bear with me. I'm struggling with the least squares formula. Wikipedia does show ways to derive the "normal equations".
But I'd like to get the same result using the chain rule. I would start from something like this (I don't know if this is possible for matrices, since I haven't seen this approach anywhere yet):
$$(y - Xb)^2$$
and differentiating it with respect to $b$, which would be something like:
$$2(y - Xb) \cdot (-X)$$
It seems, though, that a transpose is missing somewhere, since the result should look as follows:
$$ -\mathbf X^{\rm T} \mathbf y + (\mathbf X^{\rm T} \mathbf X)\,\boldsymbol b $$
Please help me to correct this or point out if I'm totally wrong.
I'm not sure your calculation is correct. Usually you have to minimize the following term:
$(y-xb)'(y-xb)$
$=(y'-b'x')(y-xb)$
Multiplying out:
$=y'y-y'xb-b'x'y+b'x'xb$
Since $y'xb$ is a scalar, it equals its own transpose: $y'xb=(y'xb)'=b'x'y$.
Thus you can write the expression as $y'y-2b'x'y+b'x'xb$
Differentiate with respect to $b$ and set the derivative equal to 0.
$-2x'y+2x'xb=0$
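The derivative uses two standard rules for vectors: $\partial(b'a)/\partial b = a$ and $\partial(b'Ab)/\partial b = 2Ab$ for symmetric $A$. The chain rule from the question leads to the same result once the transpose is put in the right place. Here is a sketch, writing $r = y - xb$ (the symbol $r$ is just my shorthand) and assuming the convention that the gradient is a column vector:

$$\begin{aligned}
\frac{\partial}{\partial b}\, r'r &= \left(\frac{\partial r}{\partial b}\right)' 2r \\
&= (-x)'\, 2(y - xb) \\
&= -2x'y + 2x'xb
\end{aligned}$$

So the attempt in the question has the right factors, but the Jacobian $-x$ has to be transposed and multiplied from the left.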
Now solve this equation for $b$. If $x'x$ is invertible, this gives $b=(x'x)^{-1}x'y$.
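If you want to convince yourself numerically, here is a minimal sketch in plain Python. It assumes a design matrix with exactly two columns (an intercept and one regressor) so that $x'xb = x'y$ can be solved by Cramer's rule. The data and the function names `normal_equations_2col` and `gradient` are made up for illustration.

```python
def normal_equations_2col(x, y):
    """Solve x'x b = x'y for a design matrix x with exactly two columns."""
    # Entries of the symmetric 2x2 matrix x'x
    a11 = sum(r[0] * r[0] for r in x)
    a12 = sum(r[0] * r[1] for r in x)
    a22 = sum(r[1] * r[1] for r in x)
    # Entries of the vector x'y
    c1 = sum(r[0] * yi for r, yi in zip(x, y))
    c2 = sum(r[1] * yi for r, yi in zip(x, y))
    det = a11 * a22 - a12 * a12
    if det == 0:
        raise ValueError("x'x is singular: the columns of x are linearly dependent")
    # Cramer's rule for the 2x2 system
    return [(c1 * a22 - a12 * c2) / det, (a11 * c2 - a12 * c1) / det]


def gradient(x, y, b):
    """Return -2x'y + 2x'xb, computed as -2 x'(y - xb)."""
    resid = [yi - (r[0] * b[0] + r[1] * b[1]) for r, yi in zip(x, y)]
    return [-2 * sum(r[j] * e for r, e in zip(x, resid)) for j in range(2)]


# Made-up data: first column is the intercept, second is the regressor
x = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
y = [1.1, 2.9, 5.2, 6.8]

b = normal_equations_2col(x, y)
grad = gradient(x, y, b)
print("b =", b)
print("gradient at b =", grad)  # should be zero up to rounding error
```

The gradient is evaluated in the form $-2x'(y-xb)$, which is the same expression as $-2x'y+2x'xb$, so it should vanish at the solution up to floating-point rounding.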