Why is my solution to $\frac{\partial tr(ABA^{T})}{\partial A}$ wrong?

Question

125 Views Asked by Bumbble Comm At 30 Mar 2026 - 6:09

Given that $A$ and $B$ are matrices, I know the correct derivative is $\frac{\partial tr(ABA^{T})}{\partial A}=A(B+B^{T})$. However, I don't see why my solution is wrong.
Here is my solution:
" First, we have $\frac{\partial tr(ABA^{T})}{\partial A}=\frac{\partial tr(A^{T}AB)}{\partial A}$.
According to the chain rule, $\frac{\partial tr(A^{T}AB)}{\partial A}=\frac{\partial tr(A^{T}AB)}{\partial (A^{T}A)}\cdot \frac{\partial A^{T}A}{\partial A}$.
Since $\frac{\partial tr(AB)}{\partial A}=B^{T}$, we get $\frac{\partial tr(A^{T}AB)}{\partial (A^{T}A)}=B^{T}$.
I also know that $\frac{\partial A^{T}A}{\partial A}$ is a supermatrix. So the result cannot be $A(B+B^{T})$. "
Could anyone tell me which steps are wrong?
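
As a side note, the chain-rule route can be tried numerically. The sketch below assumes that the "$\cdot$" in the chain rule is read as a contraction of $B^{T}$ with the four-index array $\frac{\partial (A^{T}A)_{ij}}{\partial A_{kl}} = \delta_{il}A_{kj} + \delta_{jl}A_{ki}$ over $i$ and $j$; the shapes, the random test matrices, and the names `G`, `J`, `grad_chain` are illustrative choices, not part of the question.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 4, 3
A = rng.standard_normal((n, m))
B = rng.standard_normal((m, m))  # deliberately not symmetric

# Outer derivative: d tr(S B) / dS = B^T, with S = A^T A treated as a free matrix.
G = B.T

# Inner derivative as a four-index array:
# J[i, j, k, l] = d(A^T A)_{ij} / dA_{kl} = delta_{il} A_{kj} + delta_{jl} A_{ki}
I = np.eye(m)
J = np.einsum("il,kj->ijkl", I, A) + np.einsum("jl,ki->ijkl", I, A)

# Chain rule: contract the outer derivative with the inner one over (i, j).
grad_chain = np.einsum("ij,ijkl->kl", G, J)

# Closed form quoted in the question.
grad_closed = A @ (B + B.T)

chain_matches = np.allclose(grad_chain, grad_closed)
print(chain_matches)
```

Under that reading, the contraction collapses the four-index array back to a matrix of the same shape as $A$, so a "supermatrix" as the inner derivative does not by itself rule out $A(B+B^{T})$.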

There are 2 best solutions below

**Bumbble Comm** · Answer 1 · 2018-10-19 18:55:41

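One good way to find derivatives of scalar functions of matrices is to use a first-order approximation. Let $f(X) = \operatorname{Tr}(XBX^\top)$ be our function and let $Z = X+\Delta X$. Then
$$f(Z) \approx f(X)+\langle\Delta X,\nabla f(X)\rangle,$$
where $\langle X,Y\rangle = \operatorname{Tr}(X^\top Y) = \operatorname{Tr}(Y^\top X)$.

Now, let's calculate $f(Z)$ directly:
$$f(Z) = \operatorname{Tr}((X+\Delta X)B(X+\Delta X)^\top)=\operatorname{Tr}(XBX^\top)+\operatorname{Tr}(XB\Delta X^\top)+\operatorname{Tr}(\Delta XBX^\top)+\operatorname{Tr}(\Delta XB\Delta X^\top).$$
The last term is quadratic in $\Delta X$ and can be ignored in a first-order approximation. For the two middle terms, the cyclic property of the trace and its invariance under transposition give $\operatorname{Tr}(XB\Delta X^\top) = \operatorname{Tr}(\Delta X^\top XB)$ and $\operatorname{Tr}(\Delta XBX^\top) = \operatorname{Tr}(\Delta X^\top XB^\top)$. Hence,
$$f(Z) \approx f(X) + \operatorname{Tr}(\Delta X^\top X(B+B^\top)) = f(X)+\langle\Delta X,X(B+B^\top)\rangle.$$
Comparing with the definition of the first-order approximation above, we conclude $\nabla f(X) = X(B+B^\top)$.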
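
The result can be sanity-checked numerically. The following is a minimal sketch, assuming NumPy and randomly drawn test matrices; the helper names `f`, `grad_f`, and `numerical_grad`, the shapes, and the step size `eps` are illustrative choices, not part of the derivation.

```python
import numpy as np

def f(X, B):
    """Scalar function f(X) = Tr(X B X^T)."""
    return np.trace(X @ B @ X.T)

def grad_f(X, B):
    """Closed-form gradient from the derivation: X (B + B^T)."""
    return X @ (B + B.T)

def numerical_grad(X, B, eps=1e-6):
    """Entry-by-entry central finite differences of f with respect to X."""
    G = np.zeros_like(X)
    for i in range(X.shape[0]):
        for j in range(X.shape[1]):
            E = np.zeros_like(X)
            E[i, j] = eps
            G[i, j] = (f(X + E, B) - f(X - E, B)) / (2 * eps)
    return G

rng = np.random.default_rng(0)
X = rng.standard_normal((4, 3))
B = rng.standard_normal((3, 3))  # deliberately not symmetric

# Finite differences should agree with the closed form.
gradient_matches = np.allclose(numerical_grad(X, B), grad_f(X, B), atol=1e-6)

# The part of f(X + dX) - f(X) not captured by <dX, grad f(X)>
# should be exactly the quadratic term Tr(dX B dX^T).
dX = 1e-3 * rng.standard_normal(X.shape)
first_order = np.trace(dX.T @ grad_f(X, B))
remainder = f(X + dX, B) - f(X, B) - first_order
remainder_is_quadratic = np.isclose(remainder, f(dX, B))

print(gradient_matches, remainder_is_quadratic)
```

Using a non-symmetric `B` matters here: for symmetric `B` the gradient reduces to $2XB$, and the check would not distinguish $X(B+B^\top)$ from that special case.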

**Bumbble Comm** · Answer 2 · 2020-09-17 15:39:43

Let's write the expression in terms of the Frobenius product, $X:Y = \text{Tr}(X^TY)$, for simplicity:

$$\text{Tr}(ABA^T) = AB:A$$

Then, treating $B$ as constant (so $dB = 0$), we have:

\begin{equation}
\begin{split}
d\text{Tr}(ABA^T) & = d(AB):A + AB:dA \\
& = ((dA)B + A(dB)):A + AB:dA \\
& = (dA)B:A + AB:dA \\
& = A:(dA)B + AB:dA \\
& = AB^T:(dA) + AB:dA \\
& = (AB^T + AB):dA \\
& = A(B^T + B):dA
\end{split}
\end{equation}

Finally, we get:

$$ \frac{\partial \text{Tr}(ABA^T)}{\partial A} = A(B + B^T)$$

Above, I used the following properties of the Frobenius product:

$$ A:BC = B^TA:C = AC^T:B$$

and

$$(A + B):C = A:C + B:C$$
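
These identities, and the differential they lead to, can be checked numerically. The sketch below assumes NumPy, random rectangular test matrices with compatible shapes, and a hypothetical helper `frob` implementing $X:Y = \text{Tr}(X^TY)$; the names `P`, `Q`, `R` stand in for the $A$, $B$, $C$ of the identities so they do not clash with the $A$, $B$ of the trace expression.

```python
import numpy as np

def frob(X, Y):
    """Frobenius product X:Y = Tr(X^T Y), computed as an elementwise sum."""
    return np.sum(X * Y)

rng = np.random.default_rng(1)

# Identity  P : QR = Q^T P : R = P R^T : Q  with compatible shapes.
P = rng.standard_normal((4, 3))
Q = rng.standard_normal((4, 5))
R = rng.standard_normal((5, 3))
lhs = frob(P, Q @ R)
identity_holds = np.isclose(lhs, frob(Q.T @ P, R)) and np.isclose(lhs, frob(P @ R.T, Q))

# Linearity  (P + P2) : S = P : S + P2 : S.
P2 = rng.standard_normal((4, 3))
S = rng.standard_normal((4, 3))
linearity_holds = np.isclose(frob(P + P2, S), frob(P, S) + frob(P2, S))

# First and last lines of the derivation, for a constant B:
# (dA)B : A + AB : dA  should equal  A(B^T + B) : dA.
A = rng.standard_normal((4, 3))
B = rng.standard_normal((3, 3))
dA = rng.standard_normal((4, 3))
differential_holds = np.isclose(
    frob(dA @ B, A) + frob(A @ B, dA),
    frob(A @ (B.T + B), dA),
)

print(identity_holds, linearity_holds, differential_holds)
```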
