How do I take the derivative of $(A+B \cdot C)^{T}(A+B \cdot C)$ with respect to matrix C?


Where T is the transpose operator

A is a matrix of shape (10, 1)

B is a matrix of shape (10, 3)

C is a matrix of shape (3, 1)

I am trying to find the derivative of this expression with respect to the matrix C using vector calculus, and I would like to know how to compute it without reducing it to element-by-element operations.

I am trying to follow the rules in the matrix cookbook: https://www.math.uwaterloo.ca/~hwolkowi/matrixcookbook.pdf

but I cannot seem to come up with the right answer.

Overall, I use the product rule:

The partial of the first term becomes $B^{T}$. The partial of the second term becomes $B$.

Then we apply the product rule giving us: $B^{T}(A+B\cdot C)+(A+B\cdot C)^{T}B$

But this expression adds a 3 by 1 matrix, $B^{T}(A+B\cdot C)$, to a 1 by 3 matrix, $(A+B\cdot C)^{T}B$.

What am I doing wrong?

Thanks!!

Best answer

The easiest way to calculate the derivatives of matrix-valued functions is to go back to the definition of the derivative as a limit. Let

$$f(A,B,C) = (A + BC)^{T}(A+BC).$$

Then, for small $t > 0$ and some $E \in M_{3\times 1}(\mathbb{R})$, we calculate the directional derivative in the "direction" of $E$:

$$\begin{aligned}
\frac{\partial f(A,B,C)}{\partial C}(E) &= \lim_{t\rightarrow 0}\frac{f(A,B,C+tE) - f(A,B,C)}{t}\\
&= \lim_{t\rightarrow 0}\frac{1}{t}\bigg[ \big(A + B(C+tE)\big)^{T}\big(A+B(C+tE)\big) -(A + BC)^{T}(A+BC)\bigg]\\
&= \lim_{t\rightarrow 0}\frac{1}{t}\bigg[ (A + BC)^{T}(A+BC) + t(BE)^{T}(A+BC) + t(A+BC)^{T}(BE) + t^{2}(BE)^{T}(BE) - (A+BC)^{T}(A+BC)\bigg]\\
&= \lim_{t\rightarrow 0}\bigg[(BE)^{T}(A+BC) + (A+BC)^{T}(BE) + t(BE)^{T}(BE)\bigg]\\
&= (BE)^{T}(A+BC) + (A+BC)^{T}(BE),
\end{aligned}$$

i.e.

$$\frac{\partial f(A,B,C)}{\partial C}(E) = (BE)^{T}(A+BC) + (A+BC)^{T}(BE),$$

which is similar to your expression, except that the $(3,1)$ matrix $E \in M_{3 \times 1}(\mathbb{R})$ now sorts out the dimension problem you were having: both terms are $1 \times 1$.
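As a sanity check, the directional-derivative formula can be compared against a finite difference. The sketch below is not part of the original answer: it assumes NumPy, fills $A$, $B$, $C$ and $E$ with random values of the shapes given in the question, and uses the hypothetical helper names `f` and `directional_derivative`. It also uses the fact that the two terms $(BE)^{T}(A+BC)$ and $(A+BC)^{T}(BE)$ are scalars and transposes of each other, so their sum equals $2(A+BC)^{T}BE$, which corresponds to the $(3,1)$ gradient $2B^{T}(A+BC)$.

```python
import numpy as np

# Random test data with the shapes from the question (values are arbitrary).
rng = np.random.default_rng(0)
A = rng.standard_normal((10, 1))
B = rng.standard_normal((10, 3))
C = rng.standard_normal((3, 1))
E = rng.standard_normal((3, 1))  # arbitrary direction, same shape as C


def f(C):
    """f(C) = (A + BC)^T (A + BC), returned as a Python float."""
    r = A + B @ C
    return (r.T @ r).item()


def directional_derivative(C, E):
    """Formula from the answer: (BE)^T (A + BC) + (A + BC)^T (BE)."""
    r = A + B @ C
    BE = B @ E
    return (BE.T @ r + r.T @ BE).item()


# Central finite difference along E. Because f is quadratic in C,
# the t^2 terms cancel and only rounding error remains.
t = 1e-4
numeric = (f(C + t * E) - f(C - t * E)) / (2 * t)
analytic = directional_derivative(C, E)

# Gradient form: both terms above are equal scalars, so the derivative
# along E is grad^T E with grad = 2 B^T (A + BC), a (3, 1) matrix.
grad = 2 * B.T @ (A + B @ C)
via_grad = (grad.T @ E).item()

print("finite difference:", numeric)
print("answer's formula: ", analytic)
print("grad^T E:         ", via_grad)
print("gradient shape:   ", grad.shape)
```

The three printed numbers should agree up to rounding, and the gradient has the same shape as $C$, which is the layout the question was looking for.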