Derivation of normal equation for linear least squares in matrix form

3.6k Views Asked by Bumbble Comm At 17 May 2026 - 6:01

The derivation can be found on Wikipedia, but it's not clear how each step follows.

We have $y=X\beta+\epsilon$ and want to minimize $||\epsilon||^2=\epsilon^T\epsilon$. We write the objective function as $S(\beta)=||y-X\beta||^2=y^Ty-y^TX\beta-\beta^TX^Ty+\beta^TX^TX\beta=y^Ty-2\beta^TX^Ty+\beta^TX^TX\beta$. The two middle terms can be combined by a dimension argument: each is a $1\times 1$ scalar and they are transposes of one another, so they are equal. What I don't understand is how the derivative is taken: the derivation proceeds to the partial derivative with respect to $\beta$, yielding $-X^Ty+X^TX\beta=0$.

In the last step, what happened to the $2$? And why did $\beta^T$ disappear while $\beta$ remained? My guess is that the derivative gives $-2X^Ty+2(X^TX)\beta=0$. But specifically, how does one take the partial derivative of $\beta^TX^TX\beta$ with respect to $\beta$?


Bumbble Comm On 10 Dec 2016 - 9:36

By Eq. 69 in the Matrix Cookbook (p. 10)

$\frac{\partial}{\partial\beta}(\beta^TX^Ty) = X^Ty.$

By Eq. 81 (p. 11)

$\frac{\partial}{\partial\beta}(\beta^TX^TX\beta) = (X^TX + (X^TX)^T)\beta = 2X^TX\beta.$
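For readers without the Cookbook at hand, the quadratic-form identity can be sketched component by component. The shorthand $A = X^TX$ and the indices $i, j, k$ are introduced here purely for illustration. Writing the quadratic form as a double sum,

$\beta^TA\beta = \sum_i\sum_j \beta_i A_{ij} \beta_j,$

and differentiating with respect to a single component $\beta_k$ (the product rule picks up the terms with $i=k$ and the terms with $j=k$):

$\frac{\partial}{\partial\beta_k}\sum_i\sum_j \beta_i A_{ij} \beta_j = \sum_j A_{kj}\beta_j + \sum_i \beta_i A_{ik} = (A\beta)_k + (A^T\beta)_k.$

Stacking the components over $k$ gives $(A + A^T)\beta$, and since $A = X^TX$ is symmetric this is $2X^TX\beta$. This is also why $\beta^T$ seems to disappear: the gradient is collected as a column vector, so only $\beta$ appears in the result.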

So you are right, there is a factor of 2:

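$\frac{\partial}{\partial\beta}(y^Ty - 2\beta^TX^Ty + \beta^TX^TX\beta) = 0 - 2X^Ty + 2X^TX\beta.$

Setting this to zero and dividing both sides by 2 gives $-X^Ty + X^TX\beta = 0$, i.e. the normal equation $X^TX\beta = X^Ty$. That is where the $2$ went in the version you quoted.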
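If it helps to see the gradient formula confirmed numerically, here is a minimal sketch using NumPy. The random data, the sizes `n` and `p`, and the helper names `S` and `grad_S` are all assumptions made up for this check and are not part of the original derivation. It compares the closed-form gradient against central finite differences and checks that solving $X^TX\beta = X^Ty$ agrees with NumPy's least-squares solver.

```python
import numpy as np

# Hypothetical test data: n observations, p coefficients (arbitrary sizes).
rng = np.random.default_rng(0)
n, p = 50, 3
X = rng.normal(size=(n, p))
y = rng.normal(size=n)
beta = rng.normal(size=p)  # arbitrary point at which to check the gradient


def S(b):
    """Objective S(b) = ||y - X b||^2."""
    r = y - X @ b
    return r @ r


def grad_S(b):
    """Closed-form gradient from the derivation: -2 X^T y + 2 X^T X b."""
    return -2 * X.T @ y + 2 * X.T @ X @ b


# Central finite differences along each coordinate direction.
h = 1e-6
num_grad = np.array(
    [(S(beta + h * e) - S(beta - h * e)) / (2 * h) for e in np.eye(p)]
)

# Solve the normal equation X^T X beta = X^T y, and compare with lstsq.
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
beta_lstsq = np.linalg.lstsq(X, y, rcond=None)[0]

print("max |numeric - analytic| gradient:", np.max(np.abs(num_grad - grad_S(beta))))
print("max |normal eq - lstsq| solution:", np.max(np.abs(beta_hat - beta_lstsq)))
```

Both printed differences should be close to zero, up to floating-point error. Solving the normal equation directly is used here only to mirror the derivation; for ill-conditioned $X$, a QR- or SVD-based solver such as `np.linalg.lstsq` is generally the safer choice.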
