Matrix product chain rule


Given that I know $$ \frac{\partial x}{\partial C} $$ where $$ x = f(C), \quad x \in \mathbb{R}, \quad C = AB, \quad A \in \mathbb{R}^{m \times n}, \quad B \in \mathbb{R}^{n \times p}, $$ how do I use the chain rule to compute the following derivatives? $$ \frac{\partial x}{\partial A}, \qquad \frac{\partial x}{\partial B} $$

I think that $$ \begin{aligned} \frac{\partial x}{\partial A_{i, j}} &= \sum_{k=1}^p B_{j, k} \frac{\partial x}{\partial C_{i, k}} \\ \frac{\partial x}{\partial B_{j, k}} &= \sum_{i=1}^m A_{i, j} \frac{\partial x}{\partial C_{i, k}} \end{aligned} $$

Is this right? If so, is there a more compact way to write it? A compact form matters because a non-vectorized implementation would be too slow in a computer program.


There are 1 best solutions below

On BEST ANSWER

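Your sums are correct: since $C_{i,k} = \sum_{j=1}^n A_{i,j} B_{j,k}$, we have $\partial C_{i,k} / \partial A_{i,j} = B_{j,k}$ and $\partial C_{i,k} / \partial B_{j,k} = A_{i,j}$, and the chain rule gives exactly the sums you wrote. If each derivative is stored as a matrix of the same shape as the variable it is taken with respect to (so the $(i,k)$ entry of $\frac{\partial x}{\partial C}$ is $\frac{\partial x}{\partial C_{i,k}}$), they work out as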

$$\frac{\partial x}{\partial A} = \frac{\partial x}{\partial C} B^{\sf Tr} \qquad \frac{\partial x}{\partial B} = A^{\sf Tr} \frac{\partial x}{\partial C}$$

(which suggests that perhaps you ought to represent the matrices of partial derivatives in transposed form).
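
Since the question is about vectorizing this, here is a minimal NumPy sketch of the two matrix formulas, checked against a finite-difference approximation. The function `f`, its gradient `grad_f`, and the helper names `grads_wrt_factors` and `numerical_grad` are hypothetical and chosen only for illustration; the sketch assumes each gradient is stored with the same shape as the matrix it is taken with respect to.

```python
import numpy as np


def f(C):
    # Hypothetical scalar-valued function of C, used only as an example.
    return np.sum(np.sin(C))


def grad_f(C):
    # dx/dC for the example f above; same shape as C.
    return np.cos(C)


def grads_wrt_factors(A, B, dx_dC):
    # Vectorized chain rule for C = A @ B:
    #   dx/dA = (dx/dC) B^T   with shape (m, n)
    #   dx/dB = A^T (dx/dC)   with shape (n, p)
    dx_dA = dx_dC @ B.T
    dx_dB = A.T @ dx_dC
    return dx_dA, dx_dB


def numerical_grad(fun, M, eps=1e-6):
    # Central finite differences, one entry at a time (slow; for checking only).
    G = np.zeros_like(M)
    for idx in np.ndindex(M.shape):
        M_plus = M.copy()
        M_minus = M.copy()
        M_plus[idx] += eps
        M_minus[idx] -= eps
        G[idx] = (fun(M_plus) - fun(M_minus)) / (2 * eps)
    return G


rng = np.random.default_rng(0)
m, n, p = 3, 4, 2
A = rng.standard_normal((m, n))
B = rng.standard_normal((n, p))

dx_dC = grad_f(A @ B)
dx_dA, dx_dB = grads_wrt_factors(A, B, dx_dC)

# Compare with finite differences of x = f(A @ B) in A and in B.
num_dA = numerical_grad(lambda M: f(M @ B), A)
num_dB = numerical_grad(lambda M: f(A @ M), B)
print(np.allclose(dx_dA, num_dA, atol=1e-6))
print(np.allclose(dx_dB, num_dB, atol=1e-6))
```

The index sums from the question can also be written directly as `np.einsum('jk,ik->ij', B, dx_dC)` and `np.einsum('ij,ik->jk', A, dx_dC)`, which should agree with the two matrix products above.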