Matrix differential of a trace with Hadamard product


I'm encountering difficulties taking the differential of the following matrix expression with respect to $S$:

$\text{logdet}(S) + \text{Tr}[C(D\odot((AS^{-1/2}B)(AS^{-1/2}B)^{T}))]$

$C$ and $D$ are symmetric and $S$ is diagonal, so the notation $S^{-1/2}$ means the element-wise inverse of the element-wise square root of its diagonal entries.

From the Matrix Cookbook, I know that the first term leads to $\text{Tr}(S^{-1}dS)$, and I know that I can take the differential of the expression inside the trace term, but I'm struggling with the computation because of the quadratic form coupled with the Hadamard product. I have tried rewriting the expression in terms of Hadamard and Frobenius products (which are commutative), without success.

My goal is then to find the "roots" of the derivative with respect to $S$. Given the form of the expression, my intuition is that I will obtain a fixed-point equation (in the sense that there is no closed-form expression of the form $\hat{S}=$ something that doesn't depend on $S$), but that's not a problem: I will solve it numerically.

Can you help me? Thank you in advance.

Best answer

For convenience, define the auxiliary variables
$$\eqalign{
E & = C\odot D \;=\; E^T \\
X &= AS^{-1/2}B \\
dX &= A\,dS^{-1/2}\,B \;= -\tfrac{1}{2}A(S^{-3/2}dS)\,B \\
s &= {\rm diag}(S) \quad\implies\quad S = {\rm Diag}(s) \\
}$$

Analyze the following scalar function, written using the Frobenius product instead of the trace.
$$\eqalign{
\psi &= C:D\odot XX^T \\
&= E:XX^T \\
d\psi &= E:(dX\,X^T+X\,dX^T) \\
&= (E+E^T):dX\,X^T \\
&= 2EX:dX \\
&= -EX:AS^{-3/2}dS\,B \\
&= -S^{-3/2}A^TEXB^T:dS \\
}$$

Add this to the logdet function and calculate the gradient of the combined function.
$$\eqalign{
\phi &= \log\det S + \psi \\
d\phi &= \big(S^{-1} -S^{-3/2}A^TEXB^T\big):dS \\
&= {\rm diag}\big(S^{-1} -S^{-3/2}A^TEXB^T\big):ds \\
\frac{\partial\phi}{\partial s} &= {\rm diag}\big(S^{-1} -S^{-3/2}A^TEXB^T\big) \\
}$$

The remaining task is to find the vector $s$ which produces a zero gradient.
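As a sanity check, the gradient formula can be compared against central finite differences. The following is only a minimal NumPy sketch: the function names (`phi`, `grad_phi`, `numerical_grad`), the matrix sizes, and the random test data are assumptions made for illustration and are not part of the original question.

```python
import numpy as np

def phi(s, A, B, C, D):
    """log det(S) + Tr[C (D o X X^T)] with S = Diag(s), X = A S^{-1/2} B."""
    X = (A * s**-0.5) @ B              # column scaling equals A @ Diag(s^{-1/2})
    return np.sum(np.log(s)) + np.trace(C @ (D * (X @ X.T)))

def grad_phi(s, A, B, C, D):
    """Analytic gradient diag(S^{-1} - S^{-3/2} A^T E X B^T), E = C o D."""
    E = C * D
    X = (A * s**-0.5) @ B
    G = A.T @ E @ X @ B.T              # n x n; only its diagonal is needed
    return 1.0 / s - s**-1.5 * np.diag(G)

def numerical_grad(f, s, h=1e-6):
    """Central finite differences, one coordinate of s at a time."""
    g = np.zeros_like(s)
    for i in range(s.size):
        e = np.zeros_like(s)
        e[i] = h
        g[i] = (f(s + e) - f(s - e)) / (2 * h)
    return g

# Hypothetical test data: A is m x n, B is n x k, C and D are symmetric m x m.
rng = np.random.default_rng(0)
m, n, k = 4, 3, 5
A = rng.normal(size=(m, n))
B = rng.normal(size=(n, k))
C = rng.normal(size=(m, m)); C = C + C.T
D = rng.normal(size=(m, m)); D = D + D.T
s = rng.uniform(0.5, 2.0, size=n)      # positive diagonal entries of S

g_analytic = grad_phi(s, A, B, C, D)
g_numeric = numerical_grad(lambda v: phi(v, A, B, C, D), s)
print(np.max(np.abs(g_analytic - g_numeric)))
```

The two gradients should agree up to finite-difference error; a large discrepancy would usually point to a missing symmetry assumption on $C$ or $D$.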


Update

The identities
$$\eqalign{
{\rm diag}\big(A\;{\rm Diag}(p)\,B\big) &= (B^T\odot A)\,p \\
{\rm diag}\big(A\;{\rm Diag}(p)\big) &= {\rm diag}(A)\odot p \\
}$$
can be used to develop a fixed-point iteration from the zero-gradient condition. Substituting $X = APB$ with $P = S^{-1/2}$ into that condition gives
$$\eqalign{
\def\o{{\tt1}}
P &= S^{-1/2} \quad\implies\quad p = {\rm diag}(P) = s^{\odot-1/2} \\
M &= \big(BB^T\odot A^TEA\big) \;=\; M^T \\
\\
{\rm diag}\big(S^{-1}\big) &= {\rm diag}\big(S^{-3/2}A^TEAPBB^T\big) \\
{\rm diag}\big(P^2\big) &= {\rm diag}\big(P^3A^TEAPBB^T\big) \\
p^{\odot 2} &= P^3 {\rm diag}\big(A^TEAPBB^T\big) \\
P^2\o &= P^3\,\big(BB^T\odot A^TEA\big)\,p \\
\o &= (PMP)\o \quad\iff\quad (PMP)^T\o=\o \\
}$$

This is a standard problem which can be solved using a modified $\sf Sinkhorn$ algorithm
$$\eqalign{
q_0 &= \o \quad &\{ {\rm initialize} \} \\
r_+ &= \o\oslash Mq \quad &\{ {\rm iterate} \} \\
q_+ &= \o\oslash Mr \\
p &= (q\odot r)^{\odot 1/2} \quad &\{ {\rm solution} \} \\
s &= \o\oslash(q\odot r) \\
}$$
where $\o$ is the all-ones vector and $(\oslash)$ denotes elementwise (aka Hadamard) division.
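The iteration above can be sketched in a few lines of NumPy. This is an illustrative sketch rather than a definitive implementation: the function name `sinkhorn_scaling`, the stopping rule, and the test matrices are assumptions. The sketch uses the freshly updated $r$ when computing $q_+$, and it uses strictly positive test data so that $M$ has positive entries, which is the setting where Sinkhorn-type scaling is known to converge; behaviour for an $M$ with negative entries is not addressed here.

```python
import numpy as np

def sinkhorn_scaling(M, iters=1000, tol=1e-13):
    """Alternate r = 1/(M q) and q = 1/(M r), starting from q = 1.

    Returns p = sqrt(q*r) and s = 1/(q*r). Convergence is only assumed
    here for symmetric M with strictly positive entries.
    """
    q = np.ones(M.shape[0])
    for _ in range(iters):
        r = 1.0 / (M @ q)
        q_next = 1.0 / (M @ r)         # uses the freshly updated r
        converged = np.max(np.abs(q_next - q)) < tol * np.max(np.abs(q_next))
        q = q_next
        if converged:
            break
    return np.sqrt(q * r), 1.0 / (q * r)

# Hypothetical positive test data (an assumption, chosen so that M > 0).
rng = np.random.default_rng(1)
m, n, k = 4, 3, 5
A = rng.uniform(0.5, 1.5, size=(m, n))
B = rng.uniform(0.5, 1.5, size=(n, k))
C = rng.uniform(0.5, 1.5, size=(m, m)); C = (C + C.T) / 2
D = rng.uniform(0.5, 1.5, size=(m, m)); D = (D + D.T) / 2

E = C * D
M = (B @ B.T) * (A.T @ E @ A)          # M = B B^T o A^T E A

p, s = sinkhorn_scaling(M)

# Residual of the fixed-point condition 1 = (P M P) 1, i.e. p o (M p) = 1.
residual = np.max(np.abs(p * (M @ p) - 1.0))

# Gradient diag(S^{-1} - S^{-3/2} A^T E X B^T) evaluated at the computed s.
X = (A * s**-0.5) @ B
grad = 1.0 / s - s**-1.5 * np.diag(A.T @ E @ X @ B.T)
print(residual, np.max(np.abs(grad)))
```

Both printed values should be close to zero if the iteration has converged, which confirms that the computed $s$ is a stationary point of $\phi$ for this test data.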