Matrix Partial Derivative


For this formula: $F(W, H) = \frac{1}{2}\lVert V-WH \rVert^2$

I calculated the partial derivative using matrix calculus equations in the image:

$\nabla_H F(W,H) = -(V-WH)W^T = (WH-V)W^T$

$\nabla_W F(W,H) = -(V-WH)H = (WH-V)H$

But the book gives the following result:

$\nabla_H F(W,H) = W^T(WH-V)$

$\nabla_W F(W,H) = (WH-V)H^T$

Could you explain why my answer differs from the book's result, and how to do it correctly?


There are 2 best solutions below

3
On

Just write it out:

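For matrices the norm here is the Frobenius norm, so the square is a trace: $$ F(W,H) = \frac{1}{2} \lVert V-WH \rVert^2 = \frac{1}{2} \operatorname{tr}\left[(V-WH)^T(V-WH)\right] $$ $$= \frac{1}{2} \operatorname{tr}\left(V^TV -V^TWH -H^TW^TV + H^T W^T WH\right) $$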

Now hit it with the derivative: $$ F_H = \frac{1}{2} \left(0 - W^TV - W^TV + \underbrace{(W^TWH + W^TWH)}_{\text{product rule}} \right) = W^T(WH - V)$$

The second one works the same way, differentiating with respect to $W$ instead:

$$F_W = \frac{1}{2} \left(0 - VH^T - VH^T + \underbrace{(WHH^T + WHH^T)}_{\text{product rule}} \right) = (WH - V)H^T$$
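If you want to convince yourself which version is right, a finite-difference check is a quick way to do it. The sketch below is only an illustration: the shapes ($V$ is $m\times n$, $W$ is $m\times r$, $H$ is $r\times n$) and the helper names `F`, `grad_H`, `grad_W`, `numerical_grad` are my own assumptions, not something from the book. It also shows that the expression $(WH-V)W^T$ from the question is not even a conformable product for generic shapes, while $W^T(WH-V)$ has the same shape as $H$, as a gradient with respect to $H$ must.

```python
import numpy as np

def F(V, W, H):
    """Objective 0.5 * ||V - W H||_F^2 (Frobenius norm)."""
    return 0.5 * np.linalg.norm(V - W @ H, "fro") ** 2

def grad_H(V, W, H):
    """Closed-form gradient with respect to H: W^T (W H - V)."""
    return W.T @ (W @ H - V)

def grad_W(V, W, H):
    """Closed-form gradient with respect to W: (W H - V) H^T."""
    return (W @ H - V) @ H.T

def numerical_grad(f, X, eps=1e-6):
    """Central finite differences, perturbing one entry of X at a time."""
    G = np.zeros_like(X)
    for idx in np.ndindex(*X.shape):
        E = np.zeros_like(X)
        E[idx] = eps
        G[idx] = (f(X + E) - f(X - E)) / (2 * eps)
    return G

# Assumed shapes, chosen so that m, r, n are all different.
rng = np.random.default_rng(0)
m, r, n = 5, 3, 4
V = rng.standard_normal((m, n))
W = rng.standard_normal((m, r))
H = rng.standard_normal((r, n))

# Compare closed-form gradients with the numerical ones.
err_H = np.max(np.abs(numerical_grad(lambda X: F(V, W, X), H) - grad_H(V, W, H)))
err_W = np.max(np.abs(numerical_grad(lambda X: F(V, X, H), W) - grad_W(V, W, H)))
print("max abs error, grad_H:", err_H)
print("max abs error, grad_W:", err_W)

# The expression from the question, (W H - V) W^T, multiplies an
# (m x n) matrix by an (r x m) matrix, which fails unless n == r.
try:
    (W @ H - V) @ W.T
    question_formula_conformable = True
except ValueError:
    question_formula_conformable = False
print("(WH - V) W^T conformable:", question_formula_conformable)
```

Both errors should come out tiny compared with the size of the gradient entries, since $F$ is quadratic and central differences are exact for quadratics up to rounding.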

0
On

Functions like $F=\|M\|^2_F$ can be expressed in terms of the Frobenius product as $F=M:M$. Then it's a simple matter to find the differential and derivative
$$\eqalign{ dF &= 2M:dM \cr \frac {\partial F} {\partial M} &= 2M \cr } $$

Applying this to your particular function (the factor $\frac{1}{2}$ cancels the $2$)
$$\eqalign{ dF &= (V-WH):d(V-WH) \cr &= -(V-WH):d(WH) \cr &= (WH-V):(dW\,H+W\,dH) \cr &= (WH-V):dW\,H + (WH-V):W\,dH \cr &= (WH-V)H^T:dW + W^T(WH-V):dH \cr } $$

In the last line I've applied the rules for rearranging Frobenius products
$$\eqalign{ A:BC &= B^TA:C \cr X:YZ &= XZ^T:Y \cr } $$

Holding $H$ constant is equivalent to setting $dH=0$, and yields the differential/derivative with respect to $W$
$$\eqalign{ dF &= (WH-V)H^T : dW \cr \frac {\partial F} {\partial W} &= (WH-V)H^T \cr } $$

Similarly, setting $dW=0$ yields the derivative with respect to $H$, namely $\frac{\partial F}{\partial H} = W^T(WH-V)$.
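The two rearrangement rules are easy to check numerically. Here is a minimal sketch, assuming the Frobenius product is implemented as the sum of elementwise products, $A:B=\sum_{ij}A_{ij}B_{ij}=\operatorname{tr}(A^TB)$; the helper name `frob` and the matrix shapes are my own choices for illustration.

```python
import numpy as np

def frob(A, B):
    """Frobenius product A:B = sum_ij A_ij * B_ij = tr(A^T B)."""
    return np.sum(A * B)

rng = np.random.default_rng(1)
m, r, n = 5, 3, 4
A = rng.standard_normal((m, n))  # plays the role of (WH - V)
B = rng.standard_normal((m, r))  # plays the role of W (or dW)
C = rng.standard_normal((r, n))  # plays the role of dH (or H)

# Rule 1: A : BC = B^T A : C
rule1_lhs = frob(A, B @ C)
rule1_rhs = frob(B.T @ A, C)

# Rule 2: X : YZ = X Z^T : Y  (here X = A, Y = B, Z = C)
rule2_lhs = frob(A, B @ C)
rule2_rhs = frob(A @ C.T, B)

# The elementwise-sum definition agrees with the trace form.
trace_form = np.trace(A.T @ (B @ C))

print("rule 1 holds:", np.isclose(rule1_lhs, rule1_rhs))
print("rule 2 holds:", np.isclose(rule2_lhs, rule2_rhs))
print("matches trace form:", np.isclose(rule1_lhs, trace_form))
```

Both rules are just the cyclic property of the trace written in Frobenius-product notation, which is why the shapes on each side always line up.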