find matrix gradient


Let $A \in \mathbb{S}^{m}$, $B \in \mathbb{R}^{m \times n}$, $C \in \mathbb{S}^{n}$ be arbitrary, and define the domain

$\Omega=\left\{X \in \mathbb{R}^{m \times n}: X^{\top} A X+B^{\top} X+X^{\top} B+C \succ 0\right\}$

Consider $f: \Omega \rightarrow \mathbb{R}$ given by

$\qquad f(X)=\log \operatorname{det}\left(X^{\top} A X+B^{\top} X+X^{\top} B+C\right)$

Find the gradient of $f$ with respect to $X$.


$\def\p#1#2{\frac{\partial #1}{\partial #2}}\def\B{\big}\def\L{\left}\def\R{\right}\def\o{\operatorname}$Let's use a colon to denote the trace/Frobenius product, i.e. $$\eqalign{ A:B &= \o{Tr}\L(A^TB\R) \\ A:A &= \big\|A\big\|_F^2 \\ }$$ And for typing convenience, define the symmetric matrix $$\eqalign{ M &= X^TAX + B^TX + X^TB + C \\ dM &= 2\o{Sym}\L(X^TA\,dX + B^TdX\R) \\ }$$ where $\;\o{Sym}\L(Y\R)\doteq\tfrac 12\L(Y+Y^T\R)$. The expression for $dM$ uses $A=A^T$, and since $M^{-1}$ is symmetric, $M^{-1}:\o{Sym}\L(Y\R) = M^{-1}:Y$ for any square $Y$.

Write the function using the notation above. Then calculate its differential and gradient. $$\eqalign{ f &= \log\det(M) \\ df&= d\o{Tr}(\log M) \\ &= M^{-1}:dM \\ &= 2M^{-1}:\B(X^TA\,dX + B^TdX\B) \\ &= 2M^{-1}:\B(AX+B\B)^TdX \\ &= 2\B(AX+B\B)M^{-1}:dX \\ &= 2\B(AX+B\B)\,\B(X^TAX + B^TX + X^TB + C\B)^{-1}:dX \\ \p{f}{X} &= 2\B(AX+B\B)\,\B(X^TAX + B^TX + X^TB + C\B)^{-1} \\\\ }$$
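A quick way to sanity-check a gradient like this is to compare it against central finite differences. The sketch below assumes NumPy and tests the closed form $2(AX+B)M^{-1}$ with $M = X^TAX + B^TX + X^TB + C$, as in the problem statement. The helper names (`make_problem`, `M_of`, `grad_f`, `numerical_grad`) and the way $C$ is chosen so that $X \in \Omega$ are illustrative assumptions, not part of the original question.

```python
import numpy as np

def sym(Y):
    """Symmetric part of a square matrix."""
    return 0.5 * (Y + Y.T)

def M_of(X, A, B, C):
    """M(X) = X^T A X + B^T X + X^T B + C."""
    return X.T @ A @ X + B.T @ X + X.T @ B + C

def make_problem(m=4, n=3, seed=0):
    """Random A (symmetric), B, X, and a symmetric C chosen so that M(X) > 0."""
    rng = np.random.default_rng(seed)
    A = sym(rng.standard_normal((m, m)))
    B = rng.standard_normal((m, n))
    X = rng.standard_normal((m, n))
    G = rng.standard_normal((n, n))
    N = X.T @ A @ X + B.T @ X + X.T @ B        # symmetric by construction
    C = sym(-N + G @ G.T + np.eye(n))           # then M(X) = G G^T + I
    return A, B, C, X

def f(X, A, B, C):
    """log det M(X), computed stably via slogdet."""
    _, logdet = np.linalg.slogdet(M_of(X, A, B, C))
    return logdet

def grad_f(X, A, B, C):
    """Closed form 2 (A X + B) M^{-1}, using a linear solve instead of an inverse."""
    M = M_of(X, A, B, C)
    # M is symmetric, so (A X + B) M^{-1} = (M^{-1} (A X + B)^T)^T.
    return 2.0 * np.linalg.solve(M, (A @ X + B).T).T

def numerical_grad(X, A, B, C, h=1e-6):
    """Entry-wise central finite differences of f."""
    G = np.zeros_like(X)
    for i in range(X.shape[0]):
        for j in range(X.shape[1]):
            E = np.zeros_like(X)
            E[i, j] = h
            G[i, j] = (f(X + E, A, B, C) - f(X - E, A, B, C)) / (2.0 * h)
    return G

if __name__ == "__main__":
    A, B, C, X = make_problem()
    err = np.max(np.abs(grad_f(X, A, B, C) - numerical_grad(X, A, B, C)))
    print("max abs difference:", err)
```

If the closed form is right, the reported difference should be small compared with the gradient entries; dropping the factor of 2 on $B$ or the $X^TB$ term from $M$ would show up as a large mismatch.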


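The first step in the derivation rewrites $\log\det M$ as $\operatorname{Tr}(\log M)$, which is valid because $M \succ 0$ on $\Omega$. Jacobi's formula, $d\det M = \det(M)\operatorname{Tr}\left(M^{-1}dM\right)$, leads to the same differential $M^{-1}:dM$.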
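The differential $d\log\det M = \operatorname{Tr}(M^{-1}dM)$ can be checked numerically along a single direction. This is a minimal sketch assuming NumPy; the function name `jacobi_directional_check` and the random test matrices are hypothetical choices made for illustration.

```python
import numpy as np

def logdet(M):
    """log det of a positive definite matrix."""
    _, val = np.linalg.slogdet(M)
    return val

def jacobi_directional_check(n=4, t=1e-6, seed=2):
    """Compare a central difference of log det with Tr(M^{-1} dM)."""
    rng = np.random.default_rng(seed)
    G = rng.standard_normal((n, n))
    M = G @ G.T + n * np.eye(n)              # symmetric positive definite
    dM = rng.standard_normal((n, n))
    dM = 0.5 * (dM + dM.T)                   # symmetric perturbation direction
    finite_diff = (logdet(M + t * dM) - logdet(M - t * dM)) / (2.0 * t)
    predicted = np.trace(np.linalg.solve(M, dM))   # Tr(M^{-1} dM)
    return finite_diff, predicted

if __name__ == "__main__":
    fd, pred = jacobi_directional_check()
    print("finite difference:", fd, "predicted:", pred)
```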

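The next step uses a formula from the Matrix Cookbook $$\frac{d}{dX}\operatorname{Tr}(F(X)) = F'(X)^{T}$$ where $F$ is a scalar function applied to a matrix argument. With $F=\log$ and symmetric $M$ this gives $d\operatorname{Tr}(\log M) = M^{-1}:dM$.

Subsequent steps make use of the fact that the terms in a Frobenius product can be rearranged in many equivalent ways, due to the properties of the underlying trace function. For generic matrices of compatible dimensions (not the specific $A,B,C$ of the problem), $$\eqalign{ A:B &= B:A = B^T:A^T \\ CA:B &= C:BA^T = A:C^TB \\ }$$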
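The rearrangement rules for the Frobenius product are easy to confirm on random matrices. The sketch below assumes NumPy; `frob` and `check_rearrangements` are hypothetical helper names, and the matrix shapes are arbitrary choices that make every product conformable.

```python
import numpy as np

def frob(P, Q):
    """Frobenius product P:Q = Tr(P^T Q)."""
    return np.trace(P.T @ Q)

def check_rearrangements(seed=1):
    """Return a dict mapping each identity to whether it holds numerically."""
    rng = np.random.default_rng(seed)
    P = rng.standard_normal((3, 5))
    Q = rng.standard_normal((3, 5))
    C = rng.standard_normal((3, 4))
    A = rng.standard_normal((4, 5))
    B = rng.standard_normal((3, 5))
    return {
        "A:B = B:A": bool(np.isclose(frob(P, Q), frob(Q, P))),
        "A:B = B^T:A^T": bool(np.isclose(frob(P, Q), frob(Q.T, P.T))),
        "CA:B = C:BA^T": bool(np.isclose(frob(C @ A, B), frob(C, B @ A.T))),
        "CA:B = A:C^T B": bool(np.isclose(frob(C @ A, B), frob(A, C.T @ B))),
    }

if __name__ == "__main__":
    for name, ok in check_rearrangements().items():
        print(name, "holds" if ok else "FAILS")
```

Each rule is an instance of the cyclic property of the trace together with $\operatorname{Tr}(Y) = \operatorname{Tr}(Y^T)$, which is why they hold for any conformable shapes.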