Derivative of $BAA^Tx$ with respect to $A$

Find the derivative of $B A A^T x$ with respect to $A$, where $A, B$ are $n \times n$ matrices and $x$ is a vector.

Clarification: By "derivative" I mean the derivative of each entry of $BAA^Tx$ with respect to each entry of $A$. In other words, we want the derivative of a vector with respect to a matrix.

Edit: This question and answer here is relevant.

What I know/tried: (some wishful thinking)

I know that I can write $y:=BAA^Tx$ in vectorized form as $y = (x^T \otimes B)\, \mathrm{vec}(AA^T)$. Then, by the chain rule, we should get $$ \frac{\partial y}{\partial\, \mathrm{vec}(A)^T} = (x^T \otimes B)\,\frac{\partial\, \mathrm{vec}(AA^T)}{\partial\, \mathrm{vec}(A)^T} $$

Thus, the problem actually boils down to finding the derivative of $AA^T$ with respect to $A$.

How does one go about finding that last derivative? Am I even on the right track? Thanks.

Best answer

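Renaming the question's symbols ($\mathrm A^\top$ for $B$, $\mathrm X$ for $A$, and $\mathrm b$ for $x$), let the vector-valued function $\mathrm f : \mathbb R^{n \times n} \to \mathbb R^n$ be defined by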

$$\mathrm f (\mathrm X) := \mathrm A^\top \mathrm X \mathrm X^\top \mathrm b$$

Let $\mathrm e_k$ denote the $k$-th standard basis vector and $\mathrm a_k := \mathrm A \mathrm e_k$ the $k$-th column of $\mathrm A$. The $k$-th entry of $\mathrm f$ is

$$f_k (\mathrm X) = \mathrm e_k^\top \mathrm A^\top \mathrm X \mathrm X^\top \mathrm b = \mathrm a_k^\top \mathrm X \mathrm X^\top \mathrm b = \mbox{tr} \left( \mathrm X^\top \mathrm b \mathrm a_k^\top \mathrm X \right)$$

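Take the directional derivative of $f_k$ in the direction of a matrix $\mathrm V$. Expanding $f_k (\mathrm X + h \mathrm V)$ and keeping only the terms linear in $h$ gives

$$\lim_{h \to 0} \frac{f_k (\mathrm X + h \mathrm V) - f_k (\mathrm X)}{h} = \mathrm a_k^\top \mathrm V \mathrm X^\top \mathrm b + \mathrm a_k^\top \mathrm X \mathrm V^\top \mathrm b = \mbox{tr} \left( \mathrm X^\top \left( \mathrm b \mathrm a_k^\top + \mathrm a_k \mathrm b^\top \right) \mathrm V \right) = \left\langle \left( \mathrm b \mathrm a_k^\top + \mathrm a_k \mathrm b^\top \right) \mathrm X , \mathrm V \right\rangle$$

where $\langle \mathrm G, \mathrm V \rangle := \mbox{tr} \left( \mathrm G^\top \mathrm V \right)$ is the Frobenius inner product. Hence the gradient of $f_k$ is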

$$\nabla f_k (\mathrm X) = \left( \mathrm b \mathrm a_k^\top + \mathrm a_k \mathrm b^\top \right) \mathrm X$$
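A quick numerical sanity check of this formula can be sketched as follows. This is only an illustration, assuming NumPy is available; the helper names `f`, `grad_fk` and `numeric_grad_fk`, the size `n = 4`, and the step `h` are arbitrary choices made here, not part of the original answer. It compares the closed-form gradient against central finite differences for every $k$.

```python
import numpy as np

def f(A, X, b):
    """f(X) = A^T X X^T b, the function defined in the answer."""
    return A.T @ X @ X.T @ b

def grad_fk(A, X, b, k):
    """Closed-form gradient of the k-th entry: (b a_k^T + a_k b^T) X."""
    a_k = A[:, k]  # k-th column of A
    return (np.outer(b, a_k) + np.outer(a_k, b)) @ X

def numeric_grad_fk(A, X, b, k, h=1e-6):
    """Central finite differences of f_k with respect to each entry of X."""
    G = np.zeros_like(X)
    for i in range(X.shape[0]):
        for j in range(X.shape[1]):
            E = np.zeros_like(X)
            E[i, j] = h  # perturb only entry (i, j)
            G[i, j] = (f(A, X + E, b)[k] - f(A, X - E, b)[k]) / (2 * h)
    return G

# Random test data (hypothetical; any sizes and values would do).
rng = np.random.default_rng(0)
n = 4
A = rng.standard_normal((n, n))
X = rng.standard_normal((n, n))
b = rng.standard_normal(n)

# Largest entrywise discrepancy over all k; should be at rounding-error level.
max_err = max(
    np.abs(grad_fk(A, X, b, k) - numeric_grad_fk(A, X, b, k)).max()
    for k in range(n)
)
print(max_err)
```

Because $f_k$ is quadratic in $\mathrm X$, central differences carry no truncation error, so any discrepancy should come from floating-point rounding alone. Read back in the question's symbols ($B = \mathrm A^\top$, $A = \mathrm X$, $x = \mathrm b$), the same gradient gives the entrywise derivative

$$\frac{\partial y_k}{\partial A_{ij}} = x_i \, (BA)_{kj} + B_{ki} \, (A^T x)_j$$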