How can I differentiate the following using Matrix Calculus?


The expression is:

$f(\mu_{q(\beta)}, \Sigma_{q(\beta)}) = \mathbf{1}_{n}^{T} \exp \left \{ X \mu_{q(\beta)} + \frac{1}{2} \operatorname{diagonal}(X \Sigma_{q(\beta)} X^{T}) \right\}$, where $\exp$ denotes element-wise exponentiation and $\operatorname{diagonal}(A)$ is the vector whose components are the diagonal entries of the matrix $A$. Additionally, $\mathbf{1}_{n}$ is the vector in $\mathbb{R}^{n}$ whose entries are all ones. For notational ease, I use lower-case letters for vectors and upper-case letters for matrices. $\Sigma_{q(\beta)}$ is symmetric, whereas $X$ is not.

I want to differentiate this with respect to $\mu_{q(\beta)}$ and $\Sigma_{q(\beta)}$ separately, so $\frac{\partial f}{\partial \mu_{q(\beta)}}$ and $\frac{\partial f}{\partial \Sigma_{q(\beta)}}$ are what I need.

Best answer

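Your elaborate subscripts are too hard to type, so I'm going to use simpler variables and write the problem as

$$\eqalign{ w &= \mu_{q(\beta)} \cr S &= \Sigma_{q(\beta)} \cr y &= Xw+\frac{1}{2}\,{\rm diag}(XSX^T) \cr e &= \exp(y) \cr f &= 1^Te \cr }$$

Here ${\rm diag}(A)$ extracts the diagonal of a matrix $A$ as a vector. I'll also write $E={\rm Diag}(e)$ for the diagonal matrix with $e$ on its diagonal, and use a colon for the Frobenius product, $A:B={\rm tr}(A^TB)$.

The differentials of these quantities can be expressed using the Hadamard ($\circ$) product as

$$\eqalign{ dy &= X\,dw+\frac{1}{2}\,{\rm diag}(X\,dS\,X^T) \cr de &= e\circ dy \cr df &= 1^Tde \cr &= 1^T(e\circ dy) \cr &= e^Tdy \cr &= e^T\Big(X\,dw+\frac{1}{2}\,{\rm diag}(X\,dS\,X^T)\Big) \cr }$$

Now set $dS=0$ and find the gradient with respect to $w$

$$\eqalign{ df &= e^TX\,dw \cr \frac{\partial f}{\partial w} &= e^TX \cr }$$

This is a row vector; written as a column vector, the gradient is $X^Te$.

Working out the gradient with respect to $S$ is a bit harder. Set $dw=0$ and use the identity $e^T{\rm diag}(A)={\rm Diag}(e):A$, which holds because both sides equal $\sum_i e_iA_{ii}$

$$\eqalign{ df &= \frac{1}{2}\,e^T{\rm diag}(X\,dS\,X^T) \cr &= \frac{1}{2}\,{\rm Diag}(e):X\,dS\,X^T \cr &= \frac{1}{2}\,E:X\,dS\,X^T \cr &= \frac{1}{2}\,X^TEX:dS \cr\cr \frac{\partial f}{\partial S} &= \frac{1}{2}\,X^TEX \cr &= \frac{1}{2}\,X^T\,{\rm Diag}(\exp(y))\,X \cr }$$

This treats the entries of $S$ as independent; the resulting matrix is symmetric, as you would expect for a symmetric $S$.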
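If you want to sanity-check these formulas, a finite-difference comparison is one way to do it. The following is a minimal sketch in NumPy, not part of the derivation itself: the sizes `n` and `p`, the random test data, and the helper names `f`, `gradients` and `finite_difference` are all made up for illustration. It perturbs each entry of $w$ and $S$ independently, so it checks the unconstrained gradient with respect to $S$.

```python
import numpy as np

def f(X, w, S):
    """f = 1^T exp(X w + 0.5 * diag(X S X^T))."""
    # einsum computes x_i^T S x_i for every row x_i of X, i.e. diag(X S X^T)
    y = X @ w + 0.5 * np.einsum("ij,jk,ik->i", X, S, X)
    return np.exp(y).sum()

def gradients(X, w, S):
    """Closed-form gradients: X^T e and 0.5 * X^T Diag(e) X."""
    y = X @ w + 0.5 * np.einsum("ij,jk,ik->i", X, S, X)
    e = np.exp(y)
    grad_w = X.T @ e                         # column form of e^T X
    grad_S = 0.5 * X.T @ (e[:, None] * X)    # scaling rows of X by e gives Diag(e) X
    return grad_w, grad_S

def finite_difference(fun, A, h=1e-6):
    """Central differences, perturbing one entry of A at a time."""
    G = np.zeros_like(A)
    for idx in np.ndindex(A.shape):
        step = np.zeros_like(A)
        step[idx] = h
        G[idx] = (fun(A + step) - fun(A - step)) / (2 * h)
    return G

# Hypothetical test data (small values keep exp() well behaved)
rng = np.random.default_rng(0)
n, p = 5, 3
X = 0.3 * rng.normal(size=(n, p))
w = 0.3 * rng.normal(size=p)
B = 0.3 * rng.normal(size=(p, p))
S = B @ B.T  # symmetric positive semidefinite, like a covariance matrix

grad_w, grad_S = gradients(X, w, S)
num_w = finite_difference(lambda v: f(X, v, S), w)
num_S = finite_difference(lambda M: f(X, w, M), S)

print(np.allclose(grad_w, num_w, rtol=1e-5, atol=1e-7))
print(np.allclose(grad_S, num_S, rtol=1e-5, atol=1e-7))
```

Both comparisons should agree to within the finite-difference error. Note that the code never forms the $n\times n$ matrix $XSX^T$ or the diagonal matrix $E$; it works row by row, which is equivalent to writing $f=\sum_i \exp(x_i^Tw+\frac{1}{2}x_i^TSx_i)$ with $x_i^T$ the rows of $X$.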