How to differentiate the residual sum of squares


The residual sum of squares (RSS) is defined as

$$\text{RSS}(\beta) = (y - X\beta)^T (y - X\beta)$$

Differentiating $\text{RSS}(\beta)$ with respect to $\beta$ to find the minimum of the function, the author reaches the conclusion that

$$X^T(y - X\beta) = 0$$

where $X$ is an $N \times p$ matrix, $y$ is an $N \times 1$ vector, and $\beta$ is a $p \times 1$ vector.

Can someone please show me how this conclusion was reached?

Note: This is from the book The Elements of Statistical Learning.

There is 1 answer below.

Write the RSS in terms of the Frobenius product (denoted $:$), then find its differential and gradient with respect to $\beta$:
$$\eqalign{ R &= (X\beta-y):(X\beta-y) \cr\cr dR &= 2\,(X\beta-y):(X\,d\beta) \cr &= 2\,X^T(X\beta-y):d\beta \cr\cr \frac{\partial R}{\partial\beta} &= 2\,X^T(X\beta-y) \cr }$$ Now set the gradient equal to zero, like the author does, and solve for $\beta$.

If you're uncomfortable with the Frobenius products in the above derivation, you can substitute the equivalent trace function, $A:B={\rm tr}(A^TB)$.
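As a sanity check (not part of the original answer), the gradient formula and the resulting normal equations can be verified numerically with NumPy; the variable names below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
N, p = 6, 3
X = rng.standard_normal((N, p))   # N x p design matrix
y = rng.standard_normal(N)        # N x 1 response
beta = rng.standard_normal(p)     # p x 1 coefficient vector

def rss(b):
    """RSS(beta) = (y - X b)^T (y - X b)."""
    r = y - X @ b
    return r @ r

# Analytic gradient from the derivation: dR/dbeta = 2 X^T (X beta - y)
grad = 2 * X.T @ (X @ beta - y)

# Compare against central finite differences of the scalar RSS
eps = 1e-6
num = np.array([
    (rss(beta + eps * np.eye(p)[i]) - rss(beta - eps * np.eye(p)[i])) / (2 * eps)
    for i in range(p)
])
assert np.allclose(grad, num, atol=1e-4)

# Setting the gradient to zero gives the normal equations X^T X beta = X^T y;
# the solution satisfies X^T (y - X beta_hat) = 0
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
assert np.allclose(X.T @ (y - X @ beta_hat), 0, atol=1e-10)
```

The second assertion confirms the author's conclusion: at the minimizer, the residual $y - X\hat\beta$ is orthogonal to the columns of $X$.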