Suppose I have a diagonal matrix $\pmb{D}$ and a symmetric matrix $\pmb{X}$ that is not a function of $\pmb{D}$, and I wish to find the following derivative: $$ \frac{\partial}{\partial \mathrm{diag}(\pmb{D})} \mathrm{vec}\left(\pmb{D}\pmb{X}\pmb{D}\right), $$ in which $\mathrm{diag}(\pmb{D})$ represents the vector of diagonal elements of $\pmb{D}$. I know the following derivative, which follows from $\mathrm{vec}(\pmb{A}\pmb{B}\pmb{C}) = \left(\pmb{C}^{\top} \otimes \pmb{A}\right)\mathrm{vec}(\pmb{B})$ and the symmetry of $\pmb{X}$ and $\pmb{D}$: $$ \frac{\partial}{\partial \mathrm{vec}(\pmb{D})} \mathrm{vec}\left(\pmb{D}\pmb{X}\pmb{D}\right) = \left(\pmb{D}\pmb{X} \otimes \pmb{I} \right) + \left(\pmb{I} \otimes \pmb{D}\pmb{X}\right) $$ So I guess I can find the answer by multiplying this with $\frac{\partial \mathrm{vec}\left( \pmb{D} \right)}{\partial \mathrm{diag}(\pmb{D})}$, which should be some straightforward matrix of zeroes and ones. So my questions are: (a) does this matrix have a name, and can it easily be determined? and (b) isn't there some simpler way to do this?
Derivative with respect to diagonal of diagonal matrix
I found a solution, but will not accept this answer yet in case someone has an easier one. To find $\frac{\partial \mathrm{vec}\left( \pmb{D} \right)}{\partial \mathrm{diag}(\pmb{D})}$, I recognized that an $n \times n$ diagonal matrix $\pmb{D}$ can be written as: $$ \pmb{D} = \sum_{i=1}^{n} \pmb{E}_i \delta_{ii}, $$ where $\delta_{ii}$ is the $i$th diagonal element of $\pmb{D}$ and $\pmb{E}_i$ is an $n \times n$ matrix with a 1 in the $i$th diagonal position and zeroes elsewhere, i.e. with elements: $$ e_{jk} = \begin{cases} 1 & \text{if } j = k = i \\ 0 & \text{otherwise} \end{cases} $$ The derivative then follows as: $$ \frac{\partial \mathrm{vec}\left( \pmb{D} \right)}{\partial \mathrm{diag}(\pmb{D})} = \begin{bmatrix} \mathrm{vec}(\pmb{E}_1) & \mathrm{vec}(\pmb{E}_2) & \ldots & \mathrm{vec}(\pmb{E}_n) \end{bmatrix} $$ Denoting this $n^2 \times n$ result $\pmb{E}^*$, I obtain: $$ \frac{\partial}{\partial \mathrm{diag}(\pmb{D})} \mathrm{vec}\left(\pmb{D}\pmb{X}\pmb{D}\right) = \left( \left(\pmb{D}\pmb{X} \otimes \pmb{I} \right) + \left(\pmb{I} \otimes \pmb{D}\pmb{X}\right) \right) \pmb{E}^* $$
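As a sanity check, the chain-rule construction can be compared against finite differences. The following is only a minimal NumPy sketch, assuming a column-major $\mathrm{vec}$ and a Jacobian with one column per diagonal element; the helper names `selection_matrix` and `jacobian_wrt_diagonal` are made up for illustration. The second Kronecker term is written as $\pmb{I} \otimes \pmb{D}\pmb{X}$, which is what $\mathrm{vec}(\pmb{A}\pmb{B}\pmb{C}) = (\pmb{C}^{\top} \otimes \pmb{A})\mathrm{vec}(\pmb{B})$ gives for $\mathrm{vec}(\pmb{D}\pmb{X}\,d\pmb{D})$.

```python
import numpy as np


def vec(M):
    """Column-major vectorization (stack the columns of M)."""
    return M.reshape(-1, order="F")


def selection_matrix(n):
    """Hypothetical helper: n^2 x n matrix whose i-th column is vec(E_i)."""
    E_star = np.zeros((n * n, n))
    for i in range(n):
        # Entry (i, i) of an n x n matrix sits at index i*n + i in vec().
        E_star[i * n + i, i] = 1.0
    return E_star


def jacobian_wrt_diagonal(d, X):
    """Hypothetical helper: Jacobian of vec(D X D) w.r.t. diag(D), X symmetric."""
    n = d.size
    D, I = np.diag(d), np.eye(n)
    # Jacobian with respect to the full vec(D), then keep the diagonal columns.
    full = np.kron(D @ X, I) + np.kron(I, D @ X)
    return full @ selection_matrix(n)


rng = np.random.default_rng(0)
n = 3
d = rng.normal(size=n)
B = rng.normal(size=(n, n))
X = B + B.T  # symmetric test matrix

J = jacobian_wrt_diagonal(d, X)

# Central finite differences, one column per diagonal element.
h = 1e-6
J_fd = np.empty((n * n, n))
for i in range(n):
    step = np.zeros(n)
    step[i] = h
    plus = np.diag(d + step) @ X @ np.diag(d + step)
    minus = np.diag(d - step) @ X @ np.diag(d - step)
    J_fd[:, i] = (vec(plus) - vec(minus)) / (2 * h)

max_err = np.max(np.abs(J - J_fd))
print("max abs difference:", max_err)
```

Since $\pmb{D}\pmb{X}\pmb{D}$ is quadratic in the diagonal elements, the central difference should agree with the analytic Jacobian up to rounding error.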
Although "D" is a great mnemonic for "diagonal", it is easily confused with "derivative" operations, which use the same mnemonic.
Instead, let's use a convention where upper/lower case letters are related by a diagonal operation: the uppercase letter is the diagonal matrix and the lowercase letter is the vector of its diagonal elements.
Given an arbitrary matrix $X$ and a diagonal matrix $\,A={\rm Diag}(a)$
the product in question can be expanded as $$\eqalign{ P &= AXA \cr &= X\odot aa^T \cr {\rm vec}(P) &= {\rm vec}(X)\odot{\rm vec}(aa^T) \cr v_p &= v_x\odot(a\otimes a) \cr &= V_x\,(a\otimes a) \cr }$$ where $\odot$ denotes the elementwise/Hadamard product
and $\otimes$ denotes the Kronecker product.
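The chain of identities above is easy to spot-check numerically. This is just a small NumPy sketch under the assumption that ${\rm vec}$ stacks columns (column-major order); the variable names are made up for illustration.

```python
import numpy as np


def vec(M):
    """Column-major vectorization (stack the columns of M)."""
    return M.reshape(-1, order="F")


rng = np.random.default_rng(1)
n = 4
a = rng.normal(size=n)
X = rng.normal(size=(n, n))  # arbitrary, not necessarily symmetric

A = np.diag(a)
P = A @ X @ A

# P = X (Hadamard) a a^T
ok_hadamard = np.allclose(P, X * np.outer(a, a))

# vec(a a^T) = a (Kronecker) a
ok_outer = np.allclose(vec(np.outer(a, a)), np.kron(a, a))

# vec(P) = Diag(vec(X)) (a (Kronecker) a)
ok_vec = np.allclose(vec(P), np.diag(vec(X)) @ np.kron(a, a))

print(ok_hadamard, ok_outer, ok_vec)
```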
The differential and gradient of the vectorized product are $$\eqalign{ dv_p &= V_x\,(a\otimes da+da\otimes a) \cr &= V_x\,(a\otimes E+E\otimes a)\,da \cr \frac{\partial v_p}{\partial a} &= V_x\,(a\otimes E+E\otimes a) \cr &= {\rm Diag}\Big(v_x\Big)\,\big(a\otimes E+E\otimes a\big) \cr &= \Big(v_xe^T\Big)\odot\big(a\otimes E+E\otimes a\big) \cr &= \Big({\rm vec}(X)\,e^T\Big)\odot\big(a\otimes E+E\otimes a\big) \cr }$$ where $E$ is the identity matrix and $e$ is the vector of all ones, i.e. $\,E={\rm Diag}(e)$
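To check the final expression, one can compare it against finite differences. The sketch below is only an illustration in NumPy, again assuming a column-major ${\rm vec}$; `grad_vec_product` is a made-up name, and the elementwise scaling by ${\rm vec}(X)$ is the broadcasting equivalent of $\big({\rm vec}(X)\,e^T\big)\odot(\cdot)$.

```python
import numpy as np


def vec(M):
    """Column-major vectorization (stack the columns of M)."""
    return M.reshape(-1, order="F")


def grad_vec_product(a, X):
    """Hypothetical helper: d vec(A X A) / d a for A = Diag(a), as an n^2 x n matrix."""
    n = a.size
    E = np.eye(n)
    col = a.reshape(-1, 1)                  # a as an n x 1 column
    K = np.kron(col, E) + np.kron(E, col)   # (a kron E) + (E kron a), n^2 x n
    # Scaling row r by vec(X)[r] equals Diag(vec(X)) @ K.
    return vec(X)[:, None] * K


rng = np.random.default_rng(2)
n = 3
a = rng.normal(size=n)
X = rng.normal(size=(n, n))  # arbitrary, not necessarily symmetric

G = grad_vec_product(a, X)

# Central finite differences, one column per element of a.
h = 1e-6
G_fd = np.empty((n * n, n))
for i in range(n):
    step = np.zeros(n)
    step[i] = h
    plus = np.diag(a + step) @ X @ np.diag(a + step)
    minus = np.diag(a - step) @ X @ np.diag(a - step)
    G_fd[:, i] = (vec(plus) - vec(minus)) / (2 * h)

max_err = np.max(np.abs(G - G_fd))
print("max abs difference:", max_err)
```

Note that this form never builds an $n^2 \times n^2$ Kronecker product, and it does not require $X$ to be symmetric.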