Chain rule for matrix derivatives

I have a function $f: \mathbb{R}^{n \times n} \to \mathbb{R}$, where $\mathbb{R}^{n \times n}$ denotes the set of $n \times n$ real matrices. I have a closed-form expression for its gradient, $$ g(A) := \frac{\partial}{\partial A} f(A). $$

The goal is to calculate $\frac{\partial}{\partial B} f(C'BC)$, where $B$ and $C$ are $k \times k$ and $k \times n$ matrices, respectively (so that $C'BC$ is an $n \times n$ matrix, as it should be). I suspect it must equal $C'g(C'BC)C$, but I want to make sure and would like a reference for this chain rule. Thanks a lot for your help.

[EDIT] It seems clear that my conjecture is wrong. Any help to get me on the correct path would be greatly appreciated.

There are 2 best solutions below

On BEST ANSWER

Define $A:\mathbb{R}^{k\times k}\rightarrow\mathbb{R}^{n\times n}$ as $A(B)=C'BC$

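Denote also $g(A)=[g_{ij}(A)]$, $A=[a_{ij}]$, $B=[b_{pq}]$ and $C=[c_{pi}]$, with $i,j\in\{1,\dots,n\}$ and $p,q\in\{1,\dots,k\}$. Entrywise, $a_{ij}(B)=\sum_{r,s=1}^{k}c_{ri}\,b_{rs}\,c_{sj}$, so $\dfrac{\partial a_{ij}}{\partial b_{pq}}(B)=c_{pi}c_{qj}$.

Then \begin{align}
\dfrac{\partial}{\partial B}f(C'BC)=& \dfrac{\partial}{\partial B}(f\circ A)(B)\\
\triangleq&\left[\dfrac{\partial (f\circ A)}{\partial b_{pq}}(B)\right]\\
=&\left[\sum\limits_{i,j=1}^n\dfrac{\partial f}{\partial a_{ij}}\left(A(B)\right)\dfrac{\partial a_{ij}}{\partial b_{pq}}(B)\right]\\
=&\left[\sum\limits_{i,j=1}^ng_{ij}\left(A(B)\right)\dfrac{\partial a_{ij}}{\partial b_{pq}}(B)\right]\\
=&\left[\sum\limits_{i,j=1}^ng_{ij}\left(C'BC\right)c_{pi}c_{qj}\right]\\
=&\left[c_p\,g(C'BC)\,c_q'\right]\\
=&Cg(C'BC)C'
\end{align}

where $c_p$ denotes the $p$-th row of $C$ (a $1\times n$ row vector), and each bracket $[\,\cdot\,]$ denotes the $k\times k$ matrix whose $(p,q)$ entry is shown.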
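The result $Cg(C'BC)C'$ can be sanity-checked numerically. The sketch below is only an illustration under stated assumptions: the test function `f`, its gradient `g`, and the helper names `grad_B_formula` and `grad_B_numeric` are hypothetical choices made for this check and do not come from the question. It compares the formula against central finite differences taken entry by entry in $B$.

```python
import numpy as np

def f(A):
    # Hypothetical test function (an assumption for this check only).
    return np.trace(A @ A) + np.sum(A ** 3)

def g(A):
    # Closed-form gradient of the test function above:
    # d tr(AA) = 2 A^T : dA, and d sum(A^3) = 3 A^2 (elementwise) : dA.
    return 2.0 * A.T + 3.0 * A ** 2

def grad_B_formula(B, C):
    # Chain-rule result: C g(C'BC) C', a k x k matrix.
    A = C.T @ B @ C
    return C @ g(A) @ C.T

def grad_B_numeric(B, C, eps=1e-6):
    # Central finite differences, perturbing one entry of B at a time.
    grad = np.zeros_like(B)
    for p in range(B.shape[0]):
        for q in range(B.shape[1]):
            E = np.zeros_like(B)
            E[p, q] = eps
            f_plus = f(C.T @ (B + E) @ C)
            f_minus = f(C.T @ (B - E) @ C)
            grad[p, q] = (f_plus - f_minus) / (2.0 * eps)
    return grad

rng = np.random.default_rng(0)
k, n = 3, 4
B = rng.standard_normal((k, k))
C = rng.standard_normal((k, n))

formula = grad_B_formula(B, C)
numeric = grad_B_numeric(B, C)
print("max abs difference:", np.max(np.abs(formula - numeric)))
```

The test function is deliberately not symmetric in $A$, so a transposition mistake (for example using $g(A)'$ in place of $g(A)$) would show up as a large difference.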

On

Denote the known gradient as the matrix $G$ and use it to write the differential of the function. $$\eqalign{ G &= \frac{\partial f}{\partial A} \doteq g(A) \\ df &= G:dA \\ }$$

Now use the relationship $\,A=C^TBC\,$ to change the independent variable from $A$ to $B$, holding $C$ constant so that $dA = C^T\,dB\,C$. $$\eqalign{ df &= G:d(C^TBC) \\ &= G:C^TdB\,C \\ &= CGC^T:dB \\ \frac{\partial f}{\partial B} &= CGC^T \\ }$$

In the above, a colon denotes the trace/Frobenius product, i.e. $$\eqalign{ M:N = {\rm Tr}(M^TN) = {\rm Tr}(N^TM) = N:M \\ }$$

The step from the second to the third line uses the cyclic property of the trace: $G:C^TdB\,C = {\rm Tr}(G^TC^T\,dB\,C) = {\rm Tr}(CG^TC^T\,dB) = CGC^T:dB$.
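The rearrangement $G:C^TdB\,C = CGC^T:dB$ is the one step in this derivation that is easy to get wrong, so a quick numerical check may help. The snippet below is a minimal sketch using random matrices. The helper name `frob` and the chosen sizes are assumptions made for illustration.

```python
import numpy as np

def frob(M, N):
    # Frobenius (trace) product M : N = Tr(M^T N).
    return np.trace(M.T @ N)

rng = np.random.default_rng(1)
k, n = 3, 5
G = rng.standard_normal((n, n))    # stands in for the gradient g(A), n x n
C = rng.standard_normal((k, n))    # constant k x n matrix
dB = rng.standard_normal((k, k))   # arbitrary perturbation of B, k x k

lhs = frob(G, C.T @ dB @ C)        # G : C^T dB C
rhs = frob(C @ G @ C.T, dB)        # C G C^T : dB
print("difference:", abs(lhs - rhs))
```

Because the identity holds for every perturbation `dB`, the matrix paired with `dB` on the right-hand side, `C @ G @ C.T`, is the gradient with respect to $B$.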