2nd order matrix derivative in OLS derivation


I am trying to derive the ordinary least squares formula using matrices.

The residual sum of squares is given by $(y - X\beta)(y - X\beta)^T$. I expanded this out to $yy^T - 2y\beta^TX^T + X\beta\beta^TX^T$. Now I want to take the derivative of this expression with respect to $\beta$.

I know how to take the derivative of the first two terms, but how do I take the derivative of the $X\beta\beta^TX^T$ term? Thanks!


There are 2 best solutions below

Best answer

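You're slightly off: the residual sum of squares is $(y-X\beta)'(y-X\beta)$, which is a scalar, whereas $(y-X\beta)(y-X\beta)'$ is a matrix. Note that \begin{align} S(\beta) &= (y-X\beta)'(y-X\beta)\\ &= y'y + \beta'X'X\beta - 2\beta'X'y. \end{align} Now, $A = X'X$ is a symmetric square matrix of order $p+1$, so $\beta'X'X\beta = \beta'A\beta$ is a quadratic form: $$ \beta' A \beta = \sum_j\sum_i \beta_j \beta_i a_{ij} = \sum_j \beta_j^2 a_{jj} + 2\sum_{i < j}\beta_i \beta_j a_{ij}, $$ where the last equality uses $a_{ij} = a_{ji}$. Differentiating with respect to a single component $\beta_k$ gives $$ \frac{\partial}{\partial \beta_k} (\beta' A \beta) = 2 a_{kk}\beta_k + 2\sum_{i \neq k} a_{ki}\beta_i = 2\sum_i a_{ki}\beta_i = 2(A\beta)_k, $$ and stacking these components for $k = 1, \dots, p+1$ yields $$ \frac{\partial}{\partial \beta} (\beta' A \beta) = 2A\beta, $$ i.e., $$ \frac{\partial}{\partial \beta} (\beta' X'X \beta) = 2X'X\beta. $$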
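
The identity $\frac{\partial}{\partial\beta}(\beta'A\beta) = 2A\beta$ can be sanity-checked numerically. The sketch below is an illustration only: the matrix `X`, the vector `beta`, and the helper names `quad_form`, `analytic_grad` and `numeric_grad` are all made up for this example. It uses plain Python lists to stay dependency-free and compares $2A\beta$ against a central finite-difference gradient.

```python
def quad_form(A, b):
    """Return b' A b for a square matrix A (list of rows) and vector b."""
    n = len(b)
    return sum(b[i] * A[i][j] * b[j] for i in range(n) for j in range(n))


def analytic_grad(A, b):
    """Return 2 A b, the claimed gradient (valid when A is symmetric)."""
    n = len(b)
    return [2 * sum(A[k][i] * b[i] for i in range(n)) for k in range(n)]


def numeric_grad(f, b, h=1e-6):
    """Central finite-difference approximation of the gradient of f at b."""
    grad = []
    for k in range(len(b)):
        up = list(b)
        up[k] += h
        down = list(b)
        down[k] -= h
        grad.append((f(up) - f(down)) / (2 * h))
    return grad


# Hypothetical design matrix: 3 observations, an intercept and one regressor.
X = [[1.0, 0.5], [1.0, 1.5], [1.0, 3.0]]
# A = X'X, which is symmetric by construction.
A = [[sum(row[i] * row[j] for row in X) for j in range(2)] for i in range(2)]
beta = [0.7, -1.2]  # arbitrary evaluation point (assumption)

g_exact = analytic_grad(A, beta)
g_fd = numeric_grad(lambda b: quad_form(A, b), beta)
max_err = max(abs(a - b) for a, b in zip(g_exact, g_fd))
print("2*A*beta          :", g_exact)
print("finite differences:", g_fd)
print("max abs difference:", max_err)
```

Because the function is quadratic, the central difference agrees with $2A\beta$ up to floating-point rounding. If `A` were not symmetric, the gradient would instead be $(A + A')\beta$, which is why the symmetry of $X'X$ matters in the derivation.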


An alternate approach is to take the derivative first, and expand afterwards.
That way you only have to differentiate a single term.

Let $w=(X\beta-y)$, then jot down the function, the differential and the gradient $$\eqalign{ S &= w:w \cr dS &= 2w:dw = 2w:X\,d\beta = 2X^Tw:d\beta \cr \frac{\partial S}{\partial\beta} &= 2X^Tw = 2X^T(X\beta-y) \cr }$$ where a colon represents the trace/Frobenius product, i.e. $\,A:B={\rm tr}(A^TB)$
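
As a quick check of the gradient $2X^T(X\beta-y)$, the sketch below evaluates it on a tiny made-up data set. The data `X`, `y` and the helper name `residual_grad` are assumptions for illustration. For this data the normal equations $X^TX\beta = X^Ty$ read $3a + 3b = 7$ and $3a + 5b = 10$, so the solution $\hat\beta = (5/6,\ 3/2)$ was worked out by hand, and the gradient should vanish there.

```python
def residual_grad(X, y, beta):
    """Return 2 X'(X beta - y), the gradient of S = w:w with w = X beta - y."""
    # w = X beta - y, one entry per observation
    w = [sum(xij * bj for xij, bj in zip(row, beta)) - yi
         for row, yi in zip(X, y)]
    # 2 X' w, one entry per coefficient
    return [2 * sum(row[j] * wi for row, wi in zip(X, w))
            for j in range(len(beta))]


# Hypothetical data: intercept column plus one regressor.
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]]
y = [1.0, 2.0, 4.0]

# Hand-computed solution of the normal equations for this data.
beta_hat = [5.0 / 6.0, 1.5]

grad_at_hat = residual_grad(X, y, beta_hat)           # should be ~ [0, 0]
grad_elsewhere = residual_grad(X, y, [0.0, 0.0])      # equals -2 X'y

print("gradient at beta_hat:", grad_at_hat)
print("gradient at zero    :", grad_elsewhere)
```

At $\beta = 0$ the gradient reduces to $-2X^Ty$, which matches the linear term obtained when the product is expanded before differentiating.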