How to differentiate a sum of matrix products?

I am trying to minimize a cost function by differentiating with respect to the matrix $A$. However, $A$ sits inside a sum and in the middle of a matrix product:

Solve $\frac{\partial F(A)}{\partial A} = 0$,

$0 = \frac{\partial}{\partial A} \left[\sum_{k} \log \left(\sum_{c}S_{pc}A_{ck}\right)H_{kn} \right]$

where the matrix $S$ is $p \times c$, $A$ is $c \times k$, and $H$ is $k \times n$.

I don't think I can simply apply the chain rule, $\left(\log(S_{pc}A_{ck}H_{kn})\right)' = \frac{1}{S_{pc}A_{ck}H_{kn}} (S_{pc}A_{ck}H_{kn})'$. How do I deal with the sum that precedes the multiplication when differentiating with respect to a matrix?

Best answer

The gradient of a matrix-valued function with respect to a matrix argument is a fourth-order tensor. To avoid that difficulty, let's vectorize the matrix terms.

First, define some variables. Here $1/b$ and $\log(b)$ act elementwise, $\circ$ is the elementwise (Hadamard) product, ${\rm vec}$ stacks the columns of a matrix, and $1$ denotes the all-ones vector.
$$\eqalign{
B &= SAH \cr
a &= {\rm vec}(A) \cr
b &= {\rm vec}(B) &= (H^T\otimes S)\,a \cr
y &= 1/b &\implies 1=y\circ b \cr
Y &= {\rm Diag}(y) &\implies 1=Yb \cr
f &= {\rm vec}(F) &= \log(b) \cr
}$$
Then find the differential and gradient of the function
$$\eqalign{
df &= y\circ db = Y(H^T\otimes S)\,da \cr
\frac{\partial f}{\partial a} &= Y(H^T\otimes S) \cr
}$$
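The result can be sanity-checked numerically. The following is a minimal NumPy sketch, not part of the original answer: the matrix sizes, the random positive test matrices, and the helper names `vec`, `log_SAH`, and `jacobian_log_SAH` are all illustrative assumptions. It assumes ${\rm vec}$ is column-major stacking and that $F = \log(SAH)$ is taken entrywise, and it compares $Y(H^T\otimes S)$ against a central finite-difference Jacobian.

```python
import numpy as np

def vec(M):
    # Column-major (Fortran-order) stacking, matching vec(SAH) = (H^T kron S) vec(A).
    return M.reshape(-1, order="F")

def log_SAH(S, A, H):
    # f = vec(F) with F = log(S A H) taken entrywise (assumed reading of the cost).
    return vec(np.log(S @ A @ H))

def jacobian_log_SAH(S, A, H):
    # Y (H^T kron S), where Y = Diag(1 / b) and b = vec(S A H).
    K = np.kron(H.T, S)
    b = K @ vec(A)
    return K / b[:, None]  # dividing row i by b[i] equals Diag(1/b) @ K

# Hypothetical sizes and strictly positive entries so the log is defined.
rng = np.random.default_rng(0)
p, c, k, n = 3, 2, 4, 5
S = rng.uniform(0.5, 1.5, (p, c))
A = rng.uniform(0.5, 1.5, (c, k))
H = rng.uniform(0.5, 1.5, (k, n))

J = jacobian_log_SAH(S, A, H)

# Central finite differences, perturbing one entry of vec(A) at a time.
eps = 1e-6
J_fd = np.empty_like(J)
for j in range(A.size):
    step = np.zeros(A.size)
    step[j] = eps
    dA = step.reshape(A.shape, order="F")
    J_fd[:, j] = (log_SAH(S, A + dA, H) - log_SAH(S, A - dA, H)) / (2 * eps)

max_err = np.max(np.abs(J - J_fd))
print("Jacobian shape:", J.shape)
print("max abs difference vs finite differences:", max_err)
```

The Jacobian has one row per entry of $F$ and one column per entry of $A$, so it is a reshaped version of the fourth-order tensor mentioned above.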