How to compute derivative/gradient of matrices? E.g. $w^T X^T Xw - 2w^T X^T t +t^Tt$?


Intuitively $2w^TX^Tt$ looks like $2xy$ so the derivative would be $2X^Tt$.

But what about $w^TX^TXw$?


Here $X \in \mathbb{R}^{n \times (p+1)}$, and $w, t$ are row/column vectors corresponding to the "weights" and "predictors" in the $w^T x=y$ sense.

The gradient is taken with respect to $w$, i.e. $\frac{\partial}{\partial w}$.

There are 2 best solutions below


$w^TX^Tt = t^TXw$ is linear in $w$: its derivative is the row vector $t^TX$, i.e. its gradient is $X^Tt$.

$w^T(X^TX)w$ is quadratic: its gradient is $2X^TXw$.
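
One way to see where the factor of 2 comes from is to differentiate componentwise. As a sketch, write $A = X^TX$ (a shorthand assumed here for brevity) and differentiate with respect to a single entry $w_k$:

$$
\frac{\partial}{\partial w_k}\sum_{i,j} w_i A_{ij} w_j
= \sum_j A_{kj} w_j + \sum_i w_i A_{ik}
= \big((A + A^T)w\big)_k .
$$

Since $X^TX$ is symmetric, $A + A^T = 2A$, and the right-hand side is the $k$-th entry of $2X^TXw$.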

Edit: consider $f(w)=w^T A w$ for a symmetric matrix $A$. Then $$ f(u+v)-f(u) = (u+v)^TA(u+v) - u^TAu = 2u^TAv + v^TAv = 2u^TAv + o(|v|), $$ so $f'(u)v = 2u^TAv$, i.e. the gradient at $u$ is $2Au$.
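
The claim for the quadratic form can be checked numerically. The following is a minimal sketch in plain Python, assuming a small random symmetric matrix; the function names, the size `n = 4`, and the step `h` are illustrative choices, not part of the answer. It compares $2Aw$ against a central-difference approximation of the gradient.

```python
import random

def quad_form(A, w):
    """Return w^T A w for a square matrix A (list of rows) and a vector w."""
    n = len(w)
    return sum(w[i] * A[i][j] * w[j] for i in range(n) for j in range(n))

def analytic_grad(A, w):
    """Return 2 A w, the gradient of w^T A w when A is symmetric."""
    n = len(w)
    return [2.0 * sum(A[i][j] * w[j] for j in range(n)) for i in range(n)]

def numeric_grad(f, w, h=1e-6):
    """Central-difference approximation of the gradient of f at w."""
    grad = []
    for k in range(len(w)):
        up, down = list(w), list(w)
        up[k] += h
        down[k] -= h
        grad.append((f(up) - f(down)) / (2.0 * h))
    return grad

random.seed(0)
n = 4  # illustrative dimension
B = [[random.uniform(-1, 1) for _ in range(n)] for _ in range(n)]
# Symmetrize so the assumption of the derivation holds: A = B + B^T.
A = [[B[i][j] + B[j][i] for j in range(n)] for i in range(n)]
w = [random.uniform(-1, 1) for _ in range(n)]

g_exact = analytic_grad(A, w)
g_num = numeric_grad(lambda v: quad_form(A, v), w)
max_err = max(abs(a - b) for a, b in zip(g_exact, g_num))
print(max_err)  # should be tiny, limited only by floating-point rounding
```

If the symmetrization step is dropped, the two gradients generally disagree, because the gradient of $w^TBw$ for a general $B$ is $(B + B^T)w$ rather than $2Bw$.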


Let $y = Xw-t$; then the function is $$\eqalign{ \phi &= y^Ty \cr d\phi &= 2y^Tdy = 2y^T(X\,dw) = 2(X^Ty)^Tdw \cr \frac{\partial\phi}{\partial w} &= 2X^Ty = 2X^T(Xw-t) \cr }$$
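
The result $2X^T(Xw-t)$ can also be sanity-checked against finite differences. This is a rough sketch in plain Python under assumed toy data: the sizes `n` and `p`, the random entries, and all function names are hypothetical. It evaluates $\phi(w) = \|Xw - t\|^2$, which expands to the expression in the question, and compares the closed-form gradient with a central-difference estimate.

```python
import random

def residual(X, t, w):
    """Return y = Xw - t as a list, with X given as a list of rows."""
    return [sum(x_ij * w_j for x_ij, w_j in zip(row, w)) - t_i
            for row, t_i in zip(X, t)]

def loss(X, t, w):
    """phi(w) = y^T y, which expands to w^T X^T X w - 2 w^T X^T t + t^T t."""
    return sum(y_i * y_i for y_i in residual(X, t, w))

def closed_form_grad(X, t, w):
    """Return 2 X^T (Xw - t), the gradient from the differential argument."""
    y = residual(X, t, w)
    return [2.0 * sum(X[i][j] * y[i] for i in range(len(X)))
            for j in range(len(w))]

def central_diff_grad(f, w, h=1e-6):
    """Central-difference approximation of the gradient of f at w."""
    grad = []
    for k in range(len(w)):
        up, down = list(w), list(w)
        up[k] += h
        down[k] -= h
        grad.append((f(up) - f(down)) / (2.0 * h))
    return grad

random.seed(1)
n, p = 6, 2  # hypothetical sizes; X has p + 1 columns as in the question
X = [[random.uniform(-1, 1) for _ in range(p + 1)] for _ in range(n)]
t = [random.uniform(-1, 1) for _ in range(n)]
w = [random.uniform(-1, 1) for _ in range(p + 1)]

g_closed = closed_form_grad(X, t, w)
g_numeric = central_diff_grad(lambda v: loss(X, t, v), w)
max_gap = max(abs(a - b) for a, b in zip(g_closed, g_numeric))
print(max_gap)  # should be tiny, limited only by floating-point rounding
```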