find derivative of a cost function in matrix form

564 Views Asked by At

I have a cost function $C$ of input matrix $F$ and parameter matrix $W$: $$ C(F) = \sum_{i,j}{|F(i,.)-F(j,.)|^2W_{i,j}} $$ where input matrix $F$ has size $N\times P$ ($N$ data points, each has dimension $P$). Parameter matrix $W$ has size $N \times N$ and is symmetrical i.e. $W = W^T$. $C$ can also be expressed in matrix form according to this paper (page 12): $$ C(F) = trace(F^TLF) $$ where $L = D-W$ is the Laplacian of $W$, $D$ is diagonal matrix $D(i,i)=\sum_j W(i,j)$.

Now I want to compute the derivative of $C(F)$ w.r.t $F$. I can compute $\delta C$ manually for each data point of $F$: $$ \frac{\delta C}{\delta F(i,.)}=2\sum_j{(F(i,.)-F(j,.))W_{i,j}} $$ but I wonder if there is a way to compute $\delta C/\delta F$ directly in matrix form?

1

There are 1 best solutions below

1
On BEST ANSWER

Let's use a colon (:) to denote the Frobenius product, which is a convenient infix notation for the trace, i.e. $\,\,A:B={\rm tr}(A^TB)$.

Then the cost function, differential, and gradient can be written as $$\eqalign{ C &= {\rm tr}(F^TLF) = F:LF = FF^T:L = L:FF^T \cr\cr dC &= L:(dF\,F^T+F\,dF^T) = (L+L^T):dF\,F^T = (L+L^T)F:dF \cr\cr \frac{\partial C}{\partial F} &= (L+L^T)F = 2LF \cr\cr }$$ Update

The cyclic and transpositional properties of the trace, i.e. $$\eqalign{ {\rm tr}(AB)&={\rm tr}(BA) \cr {\rm tr}(A^T)&={\rm tr}(A) \cr }$$ give rise to a variety of Frobenius product identities, e.g. $$\eqalign{ A:BC &= AC^T:B \cr &= B^TA:C \cr &= BC:A \cr }$$ The other thing to keep in mind about the Frobenius product is that the matrix on each side of the colon (:) must have the same shape. The Hadamard product has the same requirement.

In fact, the Frobenius product is just the sum of the elements in the Hadamard product $$A:B = {\rm sum}(A\odot B)$$