Is the derivative of $\pmb x^\top (I+GB)^{-\top} (I+GB)^{-1}\pmb x$ with respect to $B$ a 4th-order tensor?

Is the derivative of $\pmb x^\top (I+GB)^{-\top} (I+GB)^{-1}\pmb x$ with respect to $B$ a 4th-order tensor?

Here $\pmb x$ is a vector, and $G$ and $B$ are matrices.

I followed the procedure in "What is the derivative of $x^T A A^T x$ with respect to $A$?" and arrived at $2\pmb x\pmb x^\top(I+GB)^{-1} : -(I+GB)^{-\top}\otimes(I+GB)G$, where ":" denotes the Frobenius product. But here is the problem: the function maps $\mathbb{R}^{n\times n}$ to $\mathbb{R}$, which implies that the derivative w.r.t. $B$ should be a matrix, not a 4th-order tensor!

Can anybody help me with this? Thanks in advance!

There are 2 best solutions below

What you did was right: the function does map $\Bbb R^{n\times n}$ to $\Bbb R$, and the derivative with respect to $B$ is indeed a matrix. Each multiplication by $\pmb x$ or $\pmb x^\top$ contracts one index and so reduces the order of the tensor by $1$. Also, note that a matrix is itself a second-order tensor.
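
To make the contraction concrete, here is a sketch in index notation, writing $A=(I+GB)^{-1}$ and $\phi=\pmb x^\top A^\top A\,\pmb x$ (the symbols $A$ and $\phi$ are shorthand introduced here, not part of the question). The chain rule gives $$\frac{\partial\phi}{\partial B_{kl}} = \sum_{i,j}\frac{\partial\phi}{\partial A_{ij}}\,\frac{\partial A_{ij}}{\partial B_{kl}}, \qquad \frac{\partial\phi}{\partial A_{ij}} = 2\,\bigl(A\,\pmb x\pmb x^\top\bigr)_{ij}, \qquad \frac{\partial A_{ij}}{\partial B_{kl}} = -(AG)_{ik}\,A_{lj}.$$ The factor $\partial A_{ij}/\partial B_{kl}$ carries four indices, but the sum over $i$ and $j$ leaves only $k$ and $l$ free, so the result is the $(k,l)$ entry of a matrix: $$\frac{\partial\phi}{\partial B_{kl}} = -2\,\bigl(G^\top A^\top A\,\pmb x\pmb x^\top A^\top\bigr)_{kl}.$$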

$ \def\LR#1{\left(#1\right)} \def\op#1{\operatorname{#1}} \def\sym#1{\op{sym}\LR{#1}} \def\trace#1{\op{Tr}\LR{#1}} \def\qiq{\quad\implies\quad} \def\p{\partial}\def\grad#1#2{\frac{\p #1}{\p #2}} \def\fracLR#1#2{\LR{\frac{#1}{#2}}} $Define a new matrix variable $$\eqalign{ A=\LR{I+GB}^{-1} \qiq dA = -A\LR{G\,dB}A \\ }$$ Calculate the differential and gradient of the function $$\eqalign{ \phi &= x^TA^TAx \\ &= xx^T:A^TA \\ d\phi &= xx^T:2\sym{A^TdA} \\ &= 2Axx^T:dA \\ &= -2Axx^T:A\LR{G\,dB}A \\ &= -2G^TA^TAxx^TA^T:dB \\ \grad{\phi}{B} &= -2G^TA^TAxx^TA^T \\ }$$ So the gradient is a matrix, not a tensor.

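In this derivation I have used the matrix inner product (Frobenius product), which for any two $m\times n$ matrices $X$ and $Y$ is $$\eqalign{ X:Y &= \sum_{i=1}^m\sum_{j=1}^n X_{ij}Y_{ij} \;=\; \trace{X^TY} \\ X:X &= \|X\|^2_F \\ }$$ and the sym() operator $$\sym X = \frac{X+X^T}2$$ The cyclic property of the trace allows the factors in such a product to be rearranged, e.g. $$\eqalign{ X:YZ &= Y^TX:Z \;=\; XZ^T:Y \\ }$$ which is how $dB$ was isolated in the steps above.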
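
As a sanity check, the closed-form gradient can be compared against central finite differences. The following is a minimal NumPy sketch, not part of the derivation itself: the function names `phi`, `grad_phi`, and `numerical_grad`, the step size `h`, and the random test data are all illustrative assumptions, and $G$ and $B$ are taken to be square matrices of the same size.

```python
import numpy as np


def phi(B, G, x):
    """Scalar function phi(B) = x^T A^T A x with A = (I + G B)^{-1}."""
    n = B.shape[0]
    A = np.linalg.inv(np.eye(n) + G @ B)
    Ax = A @ x
    return float(Ax @ Ax)


def grad_phi(B, G, x):
    """Closed-form gradient -2 G^T A^T A x x^T A^T (a matrix shaped like B)."""
    n = B.shape[0]
    A = np.linalg.inv(np.eye(n) + G @ B)
    Ax = A @ x
    # A x x^T A^T is the outer product of A x with itself.
    return -2.0 * G.T @ A.T @ np.outer(Ax, Ax)


def numerical_grad(B, G, x, h=1e-6):
    """Central finite-difference approximation of d(phi)/dB, entry by entry."""
    g = np.zeros_like(B)
    for k in range(B.shape[0]):
        for l in range(B.shape[1]):
            E = np.zeros_like(B)
            E[k, l] = h
            g[k, l] = (phi(B + E, G, x) - phi(B - E, G, x)) / (2.0 * h)
    return g


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 4
    G = rng.standard_normal((n, n))
    B = 0.01 * rng.standard_normal((n, n))  # small B keeps I + G B well conditioned
    x = rng.standard_normal(n)
    diff = np.max(np.abs(grad_phi(B, G, x) - numerical_grad(B, G, x)))
    print("max abs difference:", diff)
```

Both routines return an array with the same shape as $B$, which is the point of the question: the derivative of a scalar with respect to a matrix is a matrix.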