Matrix differentiation: Combination of vectors and matrices

I want to differentiate:

$f(w) = w^TF^TFw - w^TF^Tt- t^TFw$

with respect to $w$. $F$ is an $n \times d$ matrix, $w$ is a $d \times 1$ vector, and $t$ is an $n \times 1$ vector.

I sometimes read that $(w^TF^Tt)' = (F^Tt)^T$, and sometimes that it is $F^Tt$. Why is that?

Furthermore, I know that in general $(w^TAw)' = w^T(A+A^T)$. Should it not follow that $(w^TF^TFw)' = w^T(F^TF+F^TF)$?

There are 2 best solutions below

3
On BEST ANSWER

Let's use a colon to denote the inner/Frobenius product, i.e. $$A:B={\rm tr}(A^TB)$$ (for vectors this is the ordinary dot product, $a:b=a^Tb$). Write the function in terms of the Frobenius product; finding its differential and gradient is then easy. $$\eqalign{ f &= Fw:Fw - Fw:t - t:Fw \cr &= Fw:Fw - 2t:Fw \cr \cr df &= 2Fw:F\,dw - 2t:F\,dw \cr &= 2F^TFw:dw - 2F^Tt:dw \cr &= 2(F^TFw - F^Tt):dw \cr \cr \frac{\partial f}{\partial w} &= 2F^T(Fw - t) \cr\cr }$$ The cyclic properties of the trace translate into rules for rearranging the terms in a Frobenius product. For example $$\eqalign{ AB:C &= A:CB^T \cr AB:C &= B:A^TC \cr A:B &= B:A \cr }$$
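If you want to convince yourself of the result, a quick numerical check can help. The following is only a sketch, assuming NumPy is available; the helper names `f`, `grad_f` and `numerical_grad`, the sizes `n = 5`, `d = 3` and the random test data are all made up for illustration. It compares the closed-form gradient $2F^T(Fw - t)$ against central finite differences.

```python
import numpy as np

def f(w, F, t):
    # f(w) = w^T F^T F w - w^T F^T t - t^T F w, written with Fw computed once
    Fw = F @ w
    return Fw @ Fw - 2.0 * (t @ Fw)

def grad_f(w, F, t):
    # Closed-form gradient from the derivation above: 2 F^T (F w - t)
    return 2.0 * F.T @ (F @ w - t)

def numerical_grad(func, w, eps=1e-6):
    # Central-difference estimate of each partial derivative of func at w
    g = np.zeros_like(w)
    for i in range(w.size):
        step = np.zeros_like(w)
        step[i] = eps
        g[i] = (func(w + step) - func(w - step)) / (2.0 * eps)
    return g

# Arbitrary test data (illustrative sizes, not from the question)
rng = np.random.default_rng(0)
n, d = 5, 3
F = rng.normal(size=(n, d))
t = rng.normal(size=n)
w = rng.normal(size=d)

analytic = grad_f(w, F, t)
numeric = numerical_grad(lambda v: f(v, F, t), w)
print(np.allclose(analytic, numeric, atol=1e-5))
```

Since $f$ is quadratic in $w$, central differences agree with the exact gradient up to rounding error, so the two vectors should match closely.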

0
On

In matrix differentiation, you can differentiate either with respect to $w$ or with respect to $w^T$.

For instance, take $t^Tw$, where $t$ stands for any fixed vector of the same length as $w$. When you differentiate

  • with respect to $w^T$, i.e. $\frac{\partial (t^Tw) }{\partial w^T}$, you get $t^T$ as the answer.

However, if you differentiate

  • with respect to $w$, i.e. $\frac{\partial (t^Tw) }{\partial w}$, you get $t$ as the answer.

You get the answer in the corresponding "format", so to speak.
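To make the two "formats" concrete for the term in the question, here is a short worked sketch. The symbol $a$ is introduced here only as shorthand for the fixed vector $F^Tt$; it does not appear in the original post. $$\eqalign{ w^TF^Tt &= a^Tw = \sum_i a_i w_i, \qquad a = F^Tt \cr \frac{\partial (a^Tw)}{\partial w_i} &= a_i \cr \frac{\partial (a^Tw)}{\partial w^T} &= a^T = (F^Tt)^T \quad {\rm (row\ vector)} \cr \frac{\partial (a^Tw)}{\partial w} &= a = F^Tt \quad {\rm (column\ vector)} \cr }$$ Both results contain exactly the same partial derivatives $a_i$; the only difference is whether they are arranged as a row or as a column.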

Your last derivation seems correct to me, but there too you are differentiating with respect to $w^T$. Differentiating with respect to $w$ would yield $2F^TFw$. I have found these videos quite easy to follow: https://www.youtube.com/watch?v=iWxY7VdcSH8
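As a rough sanity check of the quadratic-form rule, here is a small sketch, assuming NumPy. The helpers `quad` and `central_diff`, the sizes and the random matrices are illustrative only. It checks numerically that the derivative of $w^TAw$ has the entries of $w^T(A+A^T)$, that the row and column arrangements are transposes of each other, and that for the symmetric matrix $F^TF$ this reduces to $2F^TFw$.

```python
import numpy as np

def quad(v, M):
    # Scalar quadratic form v^T M v
    return v @ M @ v

def central_diff(func, v, eps=1e-6):
    # Central-difference estimate of the partial derivatives of func at v
    g = np.zeros_like(v)
    for i in range(v.size):
        step = np.zeros_like(v)
        step[i] = eps
        g[i] = (func(v + step) - func(v - step)) / (2.0 * eps)
    return g

# Arbitrary test data (illustrative sizes, not from the question)
rng = np.random.default_rng(1)
n, d = 4, 3
F = rng.normal(size=(n, d))
A = rng.normal(size=(d, d))      # generic square matrix, not symmetric
w = rng.normal(size=d)
S = F.T @ F                      # symmetric, so S + S^T = 2 S

# Same partial derivatives, arranged as a row (1 x d) or a column (d x 1)
row = w[None, :] @ (A + A.T)     # w^T (A + A^T)
col = (A + A.T) @ w[:, None]     # (A + A^T) w

general_ok = np.allclose(central_diff(lambda v: quad(v, A), w), row.ravel(), atol=1e-5)
layouts_ok = np.allclose(row.T, col)
symmetric_ok = np.allclose(central_diff(lambda v: quad(v, S), w), 2.0 * S @ w, atol=1e-5)
print(general_ok, layouts_ok, symmetric_ok)
```

The row/column distinction only shows up here because `row` and `col` are kept as explicit 2-D arrays; with plain 1-D NumPy vectors the two layouts are indistinguishable.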