derivative with respect to a diagonal matrix

I have checked some previous questions regarding derivatives of diagonal matrices, but haven't found one of this form.

Let $K=W\operatorname{diag}(s)W^T$, where $W$ is an $m\times n$ matrix and $\operatorname{diag}(s)$ is the $n\times n$ diagonal matrix whose diagonal is the vector $s$. I'm interested in the derivative of the log-determinant of $K$, $\frac{\partial \ln|K|}{\partial s}$, but I get stuck on this part: $\frac{\partial K}{\partial s}$.

There are 3 best solutions below

0
On BEST ANSWER

Let $K=W \operatorname{diag}(s) W^T$. According to the CAS http://www.matrixcalculus.org/:

$$ \frac{\partial }{\partial s}\log(\det(K)) = \operatorname{diag}(W^T K^{-1} W) $$

So let's prove that by hand as well. By the chain rule and Jacobi's formula, we have

$$\begin{aligned} \frac{\partial \log(\det(K))}{\partial s} &= \frac{\partial \log(\det(K))}{\partial \det (K)}\circ \frac{\partial\det(K)}{\partial K}\circ \frac{\partial K}{\partial s} \\ &= \frac{1}{\det (K)}\cdot \operatorname{tr}\Big(\operatorname{adj}(K)\frac{\partial K}{\partial s}\Big) = \operatorname{tr}\Big(K^{-1}\frac{\partial K}{\partial s}\Big) \\ \end{aligned}$$

Here, we need to be careful: $\frac{\partial K}{\partial s}$ is an $m\times m\times n$ tensor, and the trace collapses the first two dimensions, leaving a vector of length $n$. As Karthik Kannan showed, $\frac{\partial K}{\partial s_j}=w_jw_j^T$, where $w_j$ is the $j$-th column of $W$. Hence

$$\begin{aligned} \operatorname{tr}\Big(K^{-1}\frac{\partial K}{\partial s}\Big) &=\operatorname{tr}\Big(K^{-1}\frac{\partial K}{\partial s_j}\Big)_j =\operatorname{tr}\Big(K^{-1}w_j w_j^T\Big)_j \\ &=\operatorname{tr}\Big(w^T_j K^{-1}w_j\Big)_j = \Big(w^T_j K^{-1}w_j\Big)_j = \operatorname{diag}(W^T K^{-1} W) \end{aligned} $$
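
The result can also be checked numerically. The following is a minimal sketch, assuming NumPy is available; the sizes, the random seed, and the names `logdet_K`, `grad_closed`, and `grad_fd` are illustrative choices, not part of the derivation. It compares $\operatorname{diag}(W^T K^{-1} W)$ with central finite differences of $\log\det K$, using positive entries in $s$ and $m \le n$ so that $K$ is invertible.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 3, 5                          # m <= n so that K can be invertible
W = rng.standard_normal((m, n))
s = rng.uniform(0.5, 2.0, size=n)    # positive entries keep K positive definite


def logdet_K(s_vec):
    """Return log|det(W diag(s_vec) W^T)| using slogdet."""
    K_local = (W * s_vec) @ W.T      # W * s_vec scales column j of W by s_vec[j]
    _, logabsdet = np.linalg.slogdet(K_local)
    return logabsdet


# Closed-form gradient: entry j is w_j^T K^{-1} w_j, i.e. diag(W^T K^{-1} W).
K = (W * s) @ W.T
grad_closed = np.einsum("ij,ij->j", W, np.linalg.solve(K, W))

# Central finite differences, perturbing one coordinate of s at a time.
eps = 1e-6
grad_fd = np.array([
    (logdet_K(s + eps * e) - logdet_K(s - eps * e)) / (2 * eps)
    for e in np.eye(n)
])

max_abs_error = np.max(np.abs(grad_closed - grad_fd))
print(max_abs_error)
```

A further sanity check that falls out of the formula: $\sum_j s_j\, w_j^T K^{-1} w_j = \operatorname{tr}(K^{-1}K) = m$, so `s @ grad_closed` should be close to `m`.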

0
On

We have $(\text{diag}(s))_{pq} = s_{p}\delta_{pq}$. So, $$\dfrac{\partial K_{ij}}{\partial s_{k}} = \dfrac{\partial}{\partial s_{k}}\sum_{p,q}W_{ip}(\text{diag}(s))_{pq}W_{jq} = \sum_{p,q}W_{ip}\delta_{pk}\delta_{pq}W_{jq} = \sum_{p}W_{ip}\delta_{pk}W_{jp} = W_{ik}W_{jk}$$
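
As a quick check of this index formula, here is a small sketch, assuming NumPy; the helper `K_of` and the tensor name `dK` are made up for illustration. Because $K$ is linear in $s$, a unit step along $s_k$ should change $K$ by the outer product $w_k w_k^T$, up to floating-point rounding.

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 3, 4
W = rng.standard_normal((m, n))
s = rng.standard_normal(n)


def K_of(s_vec):
    """Return K = W diag(s_vec) W^T."""
    return W @ np.diag(s_vec) @ W.T


# dK[i, j, k] = W[i, k] * W[j, k]: the m x m x n derivative tensor.
dK = np.einsum("ik,jk->ijk", W, W)

for k in range(n):
    e_k = np.eye(n)[k]                       # k-th standard basis vector
    step = K_of(s + e_k) - K_of(s)           # change in K for a unit step in s_k
    assert np.allclose(step, dK[:, :, k])
    assert np.allclose(dK[:, :, k], np.outer(W[:, k], W[:, k]))
```

Contracting `dK` with `s` over the last axis recovers $K$ itself, which is another way of seeing that $K = \sum_k s_k\, w_k w_k^T$.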

2
On

We have, with $S = \operatorname{Diag}(s)$:

$$ \begin{aligned} K &= W S W^T\\ dK &= W\,dS\,W^T \end{aligned} $$

Now take the differential of your expression, using the fact that $K$ is symmetric (so $K^{-T} = K^{-1}$):

$$ \eqalign{ f &= \log \det(K) \\ &= \operatorname{tr}(\log(K)) \\ df &= K^{-T} : dK \\ &= K^{-1} : W dS W^T\\ &= W^T K^{-1} W : dS\\ &= \operatorname{diag}(W^T K^{-1} W) : ds } $$

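In the last step, $dS = \operatorname{Diag}(ds)$ has only diagonal entries, so its Frobenius product with $W^T K^{-1} W$ picks out only the diagonal of that matrix. Thus we can identify: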

\begin{equation} \frac{\partial f}{\partial s} = \operatorname{diag}(W^T K^{-1} W) \end{equation}


The colon used here denotes the Frobenius inner product:

$$ A:B = \operatorname{tr}(A^TB)$$

with the following properties derived from the underlying trace function

$$\eqalign{A:BC &= B^TA:C\cr &= AC^T:B\cr &= A^T:(BC)^T\cr &= BC:A \cr } $$
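
These rearrangement rules are easy to confirm numerically. The sketch below assumes NumPy; the helper `frob` and the matrix shapes are arbitrary illustrative choices. It also checks the fact used in the last step of the derivation, namely that $M : \operatorname{Diag}(v) = \operatorname{diag}(M) : v$ for a square matrix $M$.

```python
import numpy as np


def frob(A, B):
    """Frobenius inner product A : B = tr(A^T B), as a sum of elementwise products."""
    return np.sum(A * B)


rng = np.random.default_rng(2)
A = rng.standard_normal((3, 4))
B = rng.standard_normal((3, 5))
C = rng.standard_normal((5, 4))

lhs = frob(A, B @ C)
rearranged = [
    frob(B.T @ A, C),          # B^T A : C
    frob(A @ C.T, B),          # A C^T : B
    frob(A.T, (B @ C).T),      # A^T : (BC)^T
    frob(B @ C, A),            # BC : A
]

# Pairing a square matrix with a diagonal matrix keeps only its diagonal.
M = rng.standard_normal((4, 4))
v = rng.standard_normal(4)
diag_lhs = frob(M, np.diag(v))   # M : Diag(v)
diag_rhs = np.diag(M) @ v        # diag(M) : v

print(lhs, rearranged, diag_lhs, diag_rhs)
```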