Gradient of Kronecker Product Function


Suppose I have a matrix $A$ and vectors $c$ and $b$. How can I compute the gradient $$ \nabla_c b^T(A \otimes c)b, $$ assuming the dimensions are compatible, of course?

I've found this article, but I'm not sure how reliable it is.


There are 2 best solutions below

4
On BEST ANSWER

Write the function using the Frobenius inner product, denoted by a colon, $X:Y = \operatorname{tr}(X^TY)$: $$\eqalign{ f &= b^T(A\otimes c)b \cr &= (A\otimes c):bb^T \cr }$$ At this point, we need to factor the $bb^T$ matrix as a sum of Kronecker products $$\eqalign{ bb^T &= \sum_{k=1}^r Z_k\otimes Y_k \cr }$$ where each $Z_k$ has the same shape as $A$ and each $Y_k$ has the same shape as $c$.

For ways to compute such a factorization, look for the classic paper "Approximation with Kronecker Products" by Van Loan and Pitsianis, or Pitsianis' 1997 dissertation (which contains MATLAB code).

Substitute the factorization, apply the identity $(A\otimes c):(Z_k\otimes Y_k) = (A:Z_k)(c:Y_k)$ to each term, then calculate the differential and gradient $$\eqalign{ f &= (A\otimes c) : \sum_{k=1}^r Z_k\otimes Y_k \cr &= \sum_{k=1}^r (Z_k:A) (Y_k:c) \cr\cr df &= \sum_{k=1}^r (Z_k:A)\,Y_k :dc \cr\cr \frac{\partial f}{\partial c} &= \sum_{k=1}^r (A:Z_k)\,Y_k \cr }$$
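As a sanity check, here is a minimal NumPy sketch of the formula above. It does not use the SVD-based method from Van Loan and Pitsianis; it assumes the simplest exact factorization instead, with $Y_k = e_k$ (the standard basis vectors of $\mathbb{R}^p$) and $Z_k$ taken as rows $k, k+p, k+2p, \dots$ of $bb^T$. It also assumes $c$ is a $p\times 1$ column, $A$ is $m\times mp$, and $b\in\mathbb{R}^{mp}$, so that $A\otimes c$ is square. The function names are illustrative only.

```python
import numpy as np

def f(A, c, b):
    """f(c) = b^T (A kron c) b, with c treated as a p x 1 column."""
    return b @ np.kron(A, c.reshape(-1, 1)) @ b

def grad_via_factorization(A, b, p):
    """Gradient sum_k (A:Z_k) Y_k, assuming Y_k = e_k and Z_k = rows k, k+p, ... of b b^T."""
    M = np.outer(b, b)                # b b^T, shape (m*p, m*p)
    grad = np.zeros(p)
    for k in range(p):
        Z_k = M[k::p, :]              # same shape as A
        grad[k] = np.sum(A * Z_k)     # Frobenius product A:Z_k, coefficient of e_k
    return grad

def grad_via_finite_differences(A, c, b, eps=1e-6):
    """Central differences, used only as an independent check."""
    grad = np.zeros_like(c)
    for i in range(c.size):
        step = np.zeros_like(c)
        step[i] = eps
        grad[i] = (f(A, c + step, b) - f(A, c - step, b)) / (2 * eps)
    return grad

rng = np.random.default_rng(0)
m, p = 2, 3
A = rng.standard_normal((m, m * p))
b = rng.standard_normal(m * p)
c = rng.standard_normal(p)

g_fact = grad_via_factorization(A, b, p)
g_fd = grad_via_finite_differences(A, c, b)
print(np.allclose(g_fact, g_fd, atol=1e-6))
```

Because $f$ is linear in $c$, the gradient does not depend on $c$, and $f(c)$ should equal the dot product of the gradient with $c$.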

0
On

@CSA, why do you want to calculate the gradient, a tensor that is simple but complicated to write down, when the derivative is so easy to compute and equally effective? (Knowing the derivative is equivalent to knowing the gradient.)

Consider theorem 3.1 in your reference paper: let $f:A\in M_{m,n}\rightarrow A^TA$. The derivative is the simple linear map $Df_A:H\in M_{m,n}\rightarrow H^TA+A^TH$. From this result we can derive the gradient of $f$: $\nabla(f)(A)=I\otimes A^T+(A^T\otimes I)T$, where $T$ is the permutation $H\rightarrow H^T$. In other words, why make it simple when you can make it complicated?
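To see that the two forms carry the same information, here is a small NumPy sketch. It assumes column-major vectorization, so that $\operatorname{vec}(AXB) = (B^T\otimes A)\operatorname{vec}(X)$, and it interprets $T$ as the commutation matrix satisfying $T\operatorname{vec}(H) = \operatorname{vec}(H^T)$. The helper names `vec` and `commutation_matrix` are illustrative, not taken from the reference.

```python
import numpy as np

def vec(X):
    """Column-major vectorization (stack the columns of X)."""
    return X.flatten(order="F")

def commutation_matrix(m, n):
    """Permutation matrix K with K @ vec(H) == vec(H.T) for H of shape (m, n)."""
    K = np.zeros((m * n, m * n))
    for i in range(m):
        for j in range(n):
            # H[i, j] sits at index i + j*m in vec(H) and at j + i*n in vec(H.T)
            K[j + i * n, i + j * m] = 1.0
    return K

rng = np.random.default_rng(0)
m, n = 3, 2
A = rng.standard_normal((m, n))
H = rng.standard_normal((m, n))

# Derivative as a linear map applied to H
directional = H.T @ A + A.T @ H

# Same map written as a matrix acting on vec(H)
I = np.eye(n)
J = np.kron(I, A.T) + np.kron(A.T, I) @ commutation_matrix(m, n)

# Finite-difference check of the derivative
t = 1e-6
fd = ((A + t * H).T @ (A + t * H) - A.T @ A) / t

print(np.allclose(J @ vec(H), vec(directional)))
print(np.allclose(fd, directional, atol=1e-4))
```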

In the same way, consider theorem 4.1 in the same reference: let $g:A\in M_{m,n}\rightarrow A\otimes B$. Since $g$ is linear, its derivative is $Dg_A:H\in M_{m,n}\rightarrow H\otimes B$. After $2$ pages of calculation, the gradient is presented in a very complicated form; what is the point?

Here $p:c\in \mathbb{R}^n\rightarrow b^T(A\otimes c)b$ is linear, so its derivative is $Dp_c:h\in \mathbb{R}^n\rightarrow b^T(A\otimes h)b$, a formula of biblical simplicity.
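A quick numerical illustration, as a sketch under assumed shapes ($c$ and $h$ are treated as $k\times 1$ columns, $A$ is $m\times mk$, $b\in\mathbb{R}^{mk}$; these sizes are chosen only so that the product is defined). Since $p$ is linear, the increment $p(c+h)-p(c)$ equals the derivative applied to $h$ exactly, and the $i$-th gradient entry can be read off as $b^T(A\otimes e_i)b$.

```python
import numpy as np

def p(A, c, b):
    """p(c) = b^T (A kron c) b, with c treated as a column vector."""
    return b @ np.kron(A, c.reshape(-1, 1)) @ b

rng = np.random.default_rng(1)
m, k = 2, 3
A = rng.standard_normal((m, m * k))
b = rng.standard_normal(m * k)
c = rng.standard_normal(k)
h = rng.standard_normal(k)

# Linearity: the increment equals the derivative applied to h
increment = p(A, c + h, b) - p(A, c, b)
derivative_at_h = p(A, h, b)

# Gradient entries: apply the derivative to each standard basis vector
grad = np.array([p(A, e, b) for e in np.eye(k)])

print(np.isclose(increment, derivative_at_h))
print(np.isclose(grad @ h, derivative_at_h))
```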