Derivative of scalar function $\rm v^\top A^n v$ w.r.t. matrix $\rm A$


I would like to find the derivative of a scalar function with respect to a matrix. In particular:

For a given vector $v \in \mathbb{R}^n$, let $f(A)=v^\top A^n v$ for any integer $n\in\mathbb{N}$. I want to find $\nabla_A (v^\top A^n v)$.

When $n=1$, $$\nabla_A (v^\top A v) = vv^\top$$ and for $n=2$, $$\nabla_A(v^\top A^2 v) = A^\top vv^\top + vv^\top A^\top$$ But I don't know how to obtain the derivative for general $n$. Is there a general form of the derivative?


There are 2 best solutions below

1
On

For convenience, define a new symmetric matrix variable $$V = vv^T = V^T$$

Then the function can be written in terms of the Frobenius inner product, $X:Y = \mbox{tr}(X^T Y)$, and this new variable. Its differential and gradient can be derived as $$\eqalign{ f &= V:A^n \cr\cr df &= V:dA^n \cr &= V:\sum_{k=0}^{n-1} A^k\,dA\,A^{n-1-k} \cr &= \sum_{k=0}^{n-1} \Big(A^k\,V\,A^{n-1-k}\Big)^T : dA \cr\cr \frac{\partial f}{\partial A} &= \sum_{k=0}^{n-1} \Big(A^k\,V\,A^{n-1-k}\Big)^T \cr }$$ The step that isolates $dA$ uses the identity $V : X\,dA\,Z = X^T V Z^T : dA$, together with $V = V^T$ and the reindexing $k \to n-1-k$ of the sum. Depending on the layout convention used, the gradient might be the transpose of this expression.
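One way to sanity-check a formula like this is to compare it with finite differences. The following is a minimal NumPy sketch, not part of the original answer: the helper names `grad_formula` and `grad_finite_diff`, the matrix size, the exponent, and the step size `h` are all illustrative assumptions, and it takes the convention $G_{ij} = \partial f / \partial A_{ij}$ with $n \ge 1$.

```python
import numpy as np

def grad_formula(A, v, n):
    """Gradient of f(A) = v^T A^n v from the sum over k, with G[i, j] = df/dA[i, j].

    Assumes n >= 1.
    """
    V = np.outer(v, v)  # V = v v^T, symmetric
    powers = [np.linalg.matrix_power(A, k) for k in range(n)]  # A^0 .. A^(n-1)
    G = sum(powers[k] @ V @ powers[n - 1 - k] for k in range(n))
    return G.T

def grad_finite_diff(A, v, n, h=1e-5):
    """Central finite differences, perturbing one entry of A at a time."""
    f = lambda B: v @ np.linalg.matrix_power(B, n) @ v
    G = np.zeros_like(A)
    for i in range(A.shape[0]):
        for j in range(A.shape[1]):
            E = np.zeros_like(A)
            E[i, j] = h
            G[i, j] = (f(A + E) - f(A - E)) / (2 * h)
    return G

# Arbitrary test data (hypothetical sizes, fixed seed for repeatability)
rng = np.random.default_rng(0)
d, n = 4, 5
A = rng.standard_normal((d, d))
v = rng.standard_normal(d)

max_err = np.max(np.abs(grad_formula(A, v, n) - grad_finite_diff(A, v, n)))
print(max_err)  # should be small compared with the size of the gradient entries
```

If the two disagree by roughly a transpose, that usually points to the layout-convention issue mentioned above rather than to an error in the sum.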

0
On

Let

$$f (\mathrm X) := \mathrm a^{\top} \mathrm X^n \, \mathrm a = \mbox{tr} (\mathrm a^{\top} \mathrm X^n \, \mathrm a) = \mbox{tr} (\mathrm a \mathrm a^{\top} \mathrm X^n)$$

Since

$$(\mathrm X + h \mathrm M)^n = \mathrm X^n + h \left( \sum_{k=0}^{n-1} \mathrm X^k \mathrm M \mathrm X^{n-1-k} \right) + O (h^2)$$

then the directional derivative of $f$ in the direction of $\mathrm M$ at $\mathrm X$ is

$$D_{\mathrm M} \, f (\mathrm X) = \mbox{tr} \left( \sum_{k=0}^{n-1} \mathrm a \mathrm a^{\top} \mathrm X^k \mathrm M \mathrm X^{n-1-k} \right) = \mbox{tr} \left( \left( \sum_{k=0}^{n-1} \mathrm X^{n-1-k} \mathrm a \mathrm a^{\top} \mathrm X^k \right) \mathrm M \right)$$
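The expansion and the two trace forms of the directional derivative can be checked numerically. This is only a sketch under assumed test data: the sizes, seed, step sizes, and variable names (`lin`, `S`, `d1`, `d2`, `d_fd`) are illustrative and do not come from the answer.

```python
import numpy as np

rng = np.random.default_rng(1)
d, n = 3, 4
X = rng.standard_normal((d, d))
M = rng.standard_normal((d, d))
a = rng.standard_normal(d)

P = [np.linalg.matrix_power(X, k) for k in range(n)]  # X^0, ..., X^(n-1)

# Coefficient of h in (X + hM)^n: sum_k X^k M X^(n-1-k)
lin = sum(P[k] @ M @ P[n - 1 - k] for k in range(n))

def remainder(h):
    """Frobenius norm of (X + hM)^n - X^n - h * lin."""
    return np.linalg.norm(
        np.linalg.matrix_power(X + h * M, n) - P[n - 1] @ X - h * lin
    )

# Shrinking h by 10x should shrink an O(h^2) remainder by about 100x
ratio = remainder(1e-2) / remainder(1e-3)

# The two trace expressions for the directional derivative
aaT = np.outer(a, a)
d1 = np.trace(aaT @ lin)
S = sum(P[n - 1 - k] @ aaT @ P[k] for k in range(n))
d2 = np.trace(S @ M)

# Central finite difference of f along the direction M
f = lambda Y: a @ np.linalg.matrix_power(Y, n) @ a
h = 1e-5
d_fd = (f(X + h * M) - f(X - h * M)) / (2 * h)

print(ratio, d1, d2, d_fd)
```

Here `d1` and `d2` differ only by the cyclic property of the trace, so they should agree to rounding error, while `d_fd` should agree up to the finite-difference error.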

Thus, the gradient is

$$\nabla_{\mathrm X} f (\mathrm X) = \left( \sum_{k=0}^{n-1} \mathrm X^{n-1-k} \mathrm a \mathrm a^{\top} \mathrm X^k \right)^{\top}$$
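As a quick sanity check (a worked specialization added for illustration, assuming the convention that the gradient has entries $\partial f / \partial \mathrm X_{ij}$), setting $n=1$ and $n=2$ in this formula gives

$$\nabla_{\mathrm X} \left( \mathrm a^{\top} \mathrm X \, \mathrm a \right) = \left( \mathrm a \mathrm a^{\top} \right)^{\top} = \mathrm a \mathrm a^{\top}$$

$$\nabla_{\mathrm X} \left( \mathrm a^{\top} \mathrm X^2 \, \mathrm a \right) = \left( \mathrm X \mathrm a \mathrm a^{\top} + \mathrm a \mathrm a^{\top} \mathrm X \right)^{\top} = \mathrm a \mathrm a^{\top} \mathrm X^{\top} + \mathrm X^{\top} \mathrm a \mathrm a^{\top}$$

The $n=1$ case is a $d \times d$ matrix (with $d$ the length of $\mathrm a$), not the scalar $\mathrm a^{\top} \mathrm a$, and in the $n=2$ case both terms involve $\mathrm X^{\top}$.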