Derivative with respect to diagonal of diagonal matrix


Suppose I have a diagonal matrix $\pmb{D}$ and a symmetric matrix $\pmb{X}$ that is not a function of $\pmb{D}$, and I wish to find the following derivative: $$ \frac{\partial}{\partial \mathrm{diag}(\pmb{D})} \mathrm{vec}\left(\pmb{D}\pmb{X}\pmb{D}\right), $$ in which $\mathrm{diag}(\pmb{D})$ is the vector of diagonal elements of $\pmb{D}$. I know the following derivative: $$ \frac{\partial}{\partial \mathrm{vec}(\pmb{D})} \mathrm{vec}\left(\pmb{D}\pmb{X}\pmb{D}\right) = \left(\pmb{D}\pmb{X} \otimes \pmb{I} \right) + \left(\pmb{I} \otimes \pmb{D}\pmb{X}\right) $$ So I guess I can find the answer by multiplying this on the right by $\frac{\partial \mathrm{vec}\left( \pmb{D} \right)}{\partial \mathrm{diag}(\pmb{D})}$, which should be some straightforward matrix of zeroes and ones. So my question is: (a) does this matrix have a name, and can it easily be determined? and (b) is there a simpler way to do this?


There are 2 best solutions below

Best answer

Although "D" is a great mnemonic for "diagonal", it is easily confused with derivative operators, which are also commonly denoted by "D".

Instead let's use a convention where upper/lower case letters are related by a diagonal operation: uppercase is the matrix, lowercase is the vector.

Given an arbitrary matrix $X$ and a diagonal matrix $\,A={\rm Diag}(a)$,
the product in question can be expanded as $$\eqalign{ P &= AXA \cr &= X\odot aa^T \cr {\rm vec}(P) &= {\rm vec}(X)\odot{\rm vec}(aa^T) \cr v_p &= v_x\odot(a\otimes a) \cr &= V_x\,(a\otimes a) \cr }$$ where $\odot$ denotes the elementwise/Hadamard product,
$\otimes$ denotes the Kronecker product, and, following the convention above, $v_p={\rm vec}(P)$, $v_x={\rm vec}(X)$ and $V_x={\rm Diag}(v_x)$.
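The expansion above can be checked numerically. The following is a minimal NumPy sketch, assuming the usual column-major (column-stacking) definition of ${\rm vec}$; the helper name `vec` and the random test data are illustrative choices, not part of the answer.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
X = rng.standard_normal((n, n))   # arbitrary matrix, need not be symmetric
a = rng.standard_normal(n)        # diagonal entries of A
A = np.diag(a)

def vec(M):
    """Column-major vectorization: stack the columns of M (assumed convention)."""
    return M.reshape(-1, order="F")

P = A @ X @ A
hadamard_form = X * np.outer(a, a)      # X (Hadamard) a a^T
vec_form = vec(X) * np.kron(a, a)       # v_x (Hadamard) (a kron a)

# True when both forms agree with the direct product A X A
identity_holds = np.allclose(P, hadamard_form) and np.allclose(vec(P), vec_form)
print(identity_holds)
```

Note that ${\rm vec}(aa^T)=a\otimes a$ holds for either stacking order because $aa^T$ is symmetric, but ${\rm vec}(X)$ has to use the same order as ${\rm vec}(P)$.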

The differential and gradient of the vectorized product are $$\eqalign{ dv_p &= V_x\,(a\otimes da+da\otimes a) \cr &= V_x\,(a\otimes E+E\otimes a)\,da \cr \frac{\partial v_p}{\partial a} &= V_x\,(a\otimes E+E\otimes a) \cr &= {\rm Diag}\Big(v_x\Big)\,\big(a\otimes E+E\otimes a\big) \cr &= \Big(v_xe^T\Big)\odot\big(a\otimes E+E\otimes a\big) \cr &= \Big({\rm vec}(X)\,e^T\Big)\odot\big(a\otimes E+E\otimes a\big) \cr }$$ where $E$ is the identity matrix and $e$ is the vector of all ones, i.e. $\,E={\rm Diag}(e)$
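As a sanity check, the closed form can be compared against finite differences. This is only a sketch under stated assumptions: column-major ${\rm vec}$, and hypothetical helper names (`jacobian_formula`, `jacobian_fd`) that do not come from the answer. Because $v_p$ is quadratic in $a$, central differences should agree with the formula up to rounding error.

```python
import numpy as np

def vec(M):
    """Column-major vectorization (assumed convention)."""
    return M.reshape(-1, order="F")

def jacobian_formula(X, a):
    """Closed form (vec(X) e^T) Hadamard (a kron E + E kron a), shape (n^2, n)."""
    n = a.size
    E = np.eye(n)
    a_col = a.reshape(-1, 1)
    K = np.kron(a_col, E) + np.kron(E, a_col)
    # Multiplying row r of K by vec(X)[r] is the same as Diag(vec(X)) @ K
    return vec(X)[:, None] * K

def jacobian_fd(X, a, h=1e-6):
    """Central finite differences of vec(Diag(a) X Diag(a)) with respect to a."""
    n = a.size
    J = np.empty((n * n, n))
    for i in range(n):
        step = np.zeros(n)
        step[i] = h
        plus = np.diag(a + step) @ X @ np.diag(a + step)
        minus = np.diag(a - step) @ X @ np.diag(a - step)
        J[:, i] = vec(plus - minus) / (2 * h)
    return J

rng = np.random.default_rng(1)
X = rng.standard_normal((3, 3))
a = rng.standard_normal(3)
print(np.allclose(jacobian_formula(X, a), jacobian_fd(X, a), atol=1e-6))
```

The broadcasted product `vec(X)[:, None] * K` avoids forming the $n^2\times n^2$ matrix ${\rm Diag}(v_x)$ explicitly, which is the practical advantage of the Hadamard form in the last two lines of the derivation.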


I found a solution, but will not accept this answer yet in case someone has an easier one. My approach to finding $\frac{\partial \mathrm{vec}\left( \pmb{D} \right)}{\partial \mathrm{diag}(\pmb{D})}$ was to recognize that an $n \times n$ diagonal matrix $\pmb{D}$ can be written as $$ \pmb{D} = \sum_{i=1}^{n} \pmb{E}_i \delta_{ii}, $$ where $\delta_{ii}$ is the $i$th diagonal element of $\pmb{D}$ and $\pmb{E}_i$ is the $n \times n$ matrix with a 1 in the $i$th diagonal position and zeroes elsewhere, i.e. with elements $$ e_{jk} = \begin{cases} 1 & \text{if } j = k = i \\ 0 & \text{otherwise} \end{cases} $$ The derivative then follows as: $$ \frac{\partial \mathrm{vec}\left( \pmb{D} \right)}{\partial \mathrm{diag}(\pmb{D})} = \begin{bmatrix} \mathrm{vec}(\pmb{E}_1) & \mathrm{vec}(\pmb{E}_2) & \ldots & \mathrm{vec}(\pmb{E}_n) \end{bmatrix} $$ Denoting this result $\pmb{E}^*$, I obtain: $$ \frac{\partial}{\partial \mathrm{diag}(\pmb{D})} \mathrm{vec}\left(\pmb{D}\pmb{X}\pmb{D}\right) = \left( \left(\pmb{D}\pmb{X} \otimes \pmb{I} \right) + \left(\pmb{I} \otimes \pmb{D}\pmb{X}\right) \right) \pmb{E}^* $$
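The chain-rule construction can be sketched in NumPy as follows. The assumptions here are column-major $\mathrm{vec}$ and a symmetric $\pmb{X}$; the function names (`selection_matrix`, `jacobian_chain_rule`, `jacobian_direct`) are hypothetical. The second Kronecker term is written as $\pmb{I} \otimes \pmb{D}\pmb{X}$, which is what $\mathrm{vec}(\pmb{A}\pmb{Y}\pmb{B}) = (\pmb{B}^T \otimes \pmb{A})\,\mathrm{vec}(\pmb{Y})$ gives for the term $\pmb{D}\pmb{X}\,d\pmb{D}$. The result is compared with the product rule applied column by column, $\partial(\pmb{D}\pmb{X}\pmb{D})/\partial\delta_{ii} = \pmb{E}_i\pmb{X}\pmb{D} + \pmb{D}\pmb{X}\pmb{E}_i$.

```python
import numpy as np

def vec(M):
    """Column-major vectorization (assumed convention)."""
    return M.reshape(-1, order="F")

def selection_matrix(n):
    """E* = [vec(E_1) ... vec(E_n)], a 0/1 matrix of shape (n^2, n)."""
    E_star = np.zeros((n * n, n))
    for i in range(n):
        E_i = np.zeros((n, n))
        E_i[i, i] = 1.0
        E_star[:, i] = vec(E_i)
    return E_star

def jacobian_chain_rule(X, d):
    """((DX kron I) + (I kron DX)) E*, valid for symmetric X and diagonal D."""
    n = d.size
    D = np.diag(d)
    I = np.eye(n)
    full = np.kron(D @ X, I) + np.kron(I, D @ X)
    return full @ selection_matrix(n)

def jacobian_direct(X, d):
    """Column i is vec(E_i X D + D X E_i), the product rule applied to D X D."""
    n = d.size
    D = np.diag(d)
    J = np.empty((n * n, n))
    for i in range(n):
        E_i = np.zeros((n, n))
        E_i[i, i] = 1.0
        J[:, i] = vec(E_i @ X @ D + D @ X @ E_i)
    return J

rng = np.random.default_rng(2)
M = rng.standard_normal((4, 4))
X = M + M.T                      # symmetric, as in the question
d = rng.standard_normal(4)
print(np.allclose(jacobian_chain_rule(X, d), jacobian_direct(X, d)))
```

Column $i$ of $\pmb{E}^*$ has its single 1 at position $(i-1)(n+1)+1$, so multiplying by $\pmb{E}^*$ simply selects those $n$ columns of the $n^2 \times n^2$ matrix; in practice the columns can be indexed directly instead of forming the full Kronecker products.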