What is the gradient of a matrix functional?

Given matrices $B$ and $C$, define the functional

$$f (A) := \|ABA^T-C\|_F^2$$

All matrices are $n \times n$ and $\| \cdot \|_F$ is the Frobenius norm. What is the gradient of $f$ with respect to $A$?

I calculated it as $\nabla_A f(A)=(ABA^T-C)AB$, but I'm not sure if it is right.

There are 2 best solutions below

Let $g(A) = ABA^T -C$, then $Dg(A)H = ABH^T+HBA^T$.

Let $h(A) = \|A\|_F^2$. If the space is real, then $Dh(A)H = 2 \langle A, H \rangle$, where $\langle X, Y \rangle = \operatorname{tr}(X^TY)$.

Since $f = h \circ g$, the chain rule gives

$Df(A)H = Dh(g(A)) ( Dg(A)H) = 2 \langle ABA^T -C, ABH^T+HBA^T \rangle$

Then \begin{eqnarray} Df(A)H &=& 2 \operatorname{tr} ((AB^TA^T-C^T) (ABH^T+HBA^T)) \\ &=& \langle 2(AB^TA^TAB+ABA^TAB^T-C^TAB-CAB^T), H \rangle \end{eqnarray}

so the gradient is $\nabla f(A) = 2(AB^TA^TAB+ABA^TAB^T-C^TAB-CAB^T)$.
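The directional derivative above can be sanity-checked numerically. The following is only a rough sketch, assuming NumPy and random real $4 \times 4$ matrices; the names `f`, `grad_f`, the seed, and the step size `t` are illustrative choices, not part of the derivation.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
# Random real test matrices (an assumption made for this check only).
A, B, C, H = (rng.standard_normal((n, n)) for _ in range(4))

def f(A):
    # f(A) = ||A B A^T - C||_F^2
    return np.linalg.norm(A @ B @ A.T - C, "fro") ** 2

def grad_f(A):
    # Matrix paired with H in the last line of the derivation.
    return 2 * (A @ B.T @ A.T @ A @ B + A @ B @ A.T @ A @ B.T
                - C.T @ A @ B - C @ A @ B.T)

# Central finite difference of f along the direction H.
t = 1e-5
fd = (f(A + t * H) - f(A - t * H)) / (2 * t)

# <grad_f(A), H> = tr(grad_f(A)^T H), computed as an elementwise sum.
exact = np.sum(grad_f(A) * H)

rel_err = abs(fd - exact) / max(1.0, abs(exact))
print(rel_err)
```

A small `rel_err` suggests that the inner-product expression and the finite difference agree for this particular random draw; it is a check, not a proof.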

Alternative approach

Let us denote the Frobenius product by a colon, for brevity, i.e., \begin{align} A : B := {\rm Tr}\left( A^T B \right) \end{align}

We will use the cyclic property of trace, e.g., \begin{align} A: BCD = B^T A: CD = B^TAD^T: C \end{align}
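These rearrangement rules are easy to get wrong by a transpose, so a quick numerical check may help. This is a minimal sketch assuming NumPy; the helper name `frob` and the random $3 \times 3$ matrices are hypothetical and only serve the illustration.

```python
import numpy as np

def frob(X, Y):
    """Frobenius product X : Y = Tr(X^T Y)."""
    return np.trace(X.T @ Y)

rng = np.random.default_rng(1)
# Random real test matrices (an assumption made for this check only).
A, B, C, D = (rng.standard_normal((3, 3)) for _ in range(4))

lhs = frob(A, B @ C @ D)          # A : BCD
mid = frob(B.T @ A, C @ D)        # B^T A : CD
rhs = frob(B.T @ A @ D.T, C)      # B^T A D^T : C

print(np.isclose(lhs, mid), np.isclose(mid, rhs))
```

The same helper also shows that $X : Y$ equals the sum of the elementwise products, i.e. `np.sum(X * Y)`.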

Let us rewrite your function in terms of the Frobenius product for simplicity, \begin{align} f(A) &= \left\| ABA^T - C \right\|_F^2 \\ &\equiv \left( ABA^T - C \right) : \left( ABA^T - C \right) \end{align}

To find the gradient $\frac{\partial f}{\partial A}$, we compute the differential and then obtain the gradient \begin{align} df(A) &= 2 \left(ABA^T - C \right) : d(ABA^T) \\ &= 2 \left( ABA^T - C \right): \left( dA BA^T + ABdA^T\right)\\ &= 2 \left( ABA^T - C \right): dA BA^T + 2 \left( ABA^T - C \right): ABdA^T \\ &= 2 \left( ABA^T - C \right) \left( BA^T \right)^T : dA + 2 \left( AB \right)^T \left( ABA^T - C \right): dA^T \\ &= 2 \left( ABA^T - C \right) AB^T : dA + 2 \left( ABA^T - C \right)^T \left( AB \right): dA \end{align}

The gradient is \begin{align} \frac{\partial f(A)}{\partial A} &= 2 \left( ABA^T - C \right) AB^T + 2 \left( AB^TA^T - C^T \right) AB \end{align}
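As a hedged cross-check of this gradient, one can compare it entry by entry with central finite differences. The sketch below assumes NumPy and random real $3 \times 3$ matrices; `grad_formula`, `grad_numeric`, and the step size are illustrative names and values. It also evaluates the candidate $(ABA^T-C)AB$ from the question for comparison.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 3
# Random real test matrices (an assumption made for this check only).
A, B, C = (rng.standard_normal((n, n)) for _ in range(3))

def f(A):
    M = A @ B @ A.T - C
    return np.sum(M * M)  # squared Frobenius norm

def grad_formula(A):
    # 2 (A B A^T - C) A B^T + 2 (A B^T A^T - C^T) A B
    M = A @ B @ A.T - C
    return 2 * M @ A @ B.T + 2 * M.T @ A @ B

def grad_numeric(A, t=1e-6):
    # Central difference in each entry A[i, j].
    G = np.zeros_like(A)
    for i in range(A.shape[0]):
        for j in range(A.shape[1]):
            E = np.zeros_like(A)
            E[i, j] = t
            G[i, j] = (f(A + E) - f(A - E)) / (2 * t)
    return G

max_err = np.max(np.abs(grad_formula(A) - grad_numeric(A)))

# Candidate from the question, evaluated on the same matrices.
candidate = (A @ B @ A.T - C) @ A @ B
gap = np.max(np.abs(grad_formula(A) - candidate))

print(max_err, gap)
```

For generic non-symmetric $B$ and $C$, `max_err` should be tiny while `gap` is not, which is consistent with the question's candidate missing both the transposed term and the overall factor.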

If $B$ and $C$ are symmetric, this simplifies to $4\left( ABA^T - C \right) AB$. I hope this helps.