Derivative of $(Ax) \otimes y$ with respect to $x$


Suppose $A$ is an $n \times n$ matrix and that $x,\, k$ are $n \times 1$ vectors, where $k$ is a constant vector.

Let $$ y := \left( Ax \right) \otimes k. $$ Note that by $\otimes$ in this context we mean the outer product of two vectors, given by $v \otimes u = vu^\top$.

I would like to find $\frac{\partial y}{\partial x}$. By this symbol I mean to find the derivative of each entry of $y$ with respect to each entry of $x$.

I know that,

$$ \mathrm{d}(x \otimes y) = (\mathrm{d}x)\otimes y + x \otimes (\mathrm{d}y) $$

Therefore, since $k$ is constant, $\mathrm{d}y = (\mathrm{d}(Ax)) \otimes k = (A\,\mathrm{d}x)\,k^\top$, so (at least formally) we would have $$ \frac{\partial y}{\partial x} = A \otimes k. $$

But then what I don't understand (if the way I have found the derivative is correct) is what is meant by $A \otimes k$. Does it mean the Kronecker product of $A$ and $k$ in the usual sense?

Can someone please clarify? Better yet, how does one find this derivative?

EDIT: $Axk^\top$ is an $n \times n$ matrix.
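For concreteness, here is a minimal numpy sketch of the setup (the size $n = 4$, the seed, and the random entries are arbitrary choices):

```python
import numpy as np

# Arbitrary size and data, just to make the shapes concrete
n = 4
rng = np.random.default_rng(0)
A = rng.standard_normal((n, n))
x = rng.standard_normal(n)
k = rng.standard_normal(n)  # the constant vector

# y = (Ax) ⊗ k = (Ax) k^T, an n x n matrix
Y = np.outer(A @ x, k)
print(Y.shape)  # (4, 4)
```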


There are 3 answers below.

BEST ANSWER

Vectorization provides a straightforward solution. Write $y = {\rm vec}(Y)$ and use the identity ${\rm vec}(ABC) = (C^T \otimes A)\,{\rm vec}(B)$ with $B = x$ and $C = k^T$: $$\eqalign{ {\rm vec}(Y) &= {\rm vec}(Axk^T) \cr y &= (k\otimes A)\,x \cr dy &= (k\otimes A)\,dx \cr \frac{\partial y}{\partial x} &= (k \otimes A) \cr }$$ In index notation this is just greg's answer: $\,\frac{\partial Y_{ij}}{\partial x_m} = A_{im}k_{j}$.
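As a quick numerical sanity check, here is a numpy sketch (arbitrary sizes and data; note that the identity ${\rm vec}(ABC) = (C^T \otimes A)\,{\rm vec}(B)$ assumes column-major vectorization, hence `order="F"` below):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
A = rng.standard_normal((n, n))
x = rng.standard_normal(n)
k = rng.standard_normal(n)

def vecY(x):
    # Column-major vec of Y = (Ax) k^T
    return np.outer(A @ x, k).ravel(order="F")

# The claimed Jacobian: the Kronecker product k ⊗ A, an n^2 x n matrix
J = np.kron(k[:, None], A)

# Check 1: vec(Y) == (k ⊗ A) x
assert np.allclose(vecY(x), J @ x)

# Check 2: J matches a finite-difference Jacobian
# (exact up to rounding, since y is linear in x)
eps = 1e-6
J_fd = np.column_stack([(vecY(x + eps * e) - vecY(x - eps * e)) / (2 * eps)
                        for e in np.eye(n)])
assert np.allclose(J, J_fd, atol=1e-6)
```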

ANSWER

Let $f: \mathbb{R}^n \rightarrow \mathbb{R}^{n \times n}$, $f(x) = (Ax)k^T$. Note that $f$ is a linear map. Thus $df = f$, meaning the derivative of $f$ at the point $a \in \mathbb{R}^n$ is the linear map $df(a)[h] = f(h)$ for $h \in \mathbb{R}^n$.
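A short numerical illustration of this point (a numpy sketch; the point $a$, direction $h$, and step $t$ are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
A = rng.standard_normal((n, n))
k = rng.standard_normal(n)

def f(x):
    # f(x) = (Ax) k^T, a linear map from R^n to n x n matrices
    return np.outer(A @ x, k)

a = rng.standard_normal(n)
h = rng.standard_normal(n)

# For a linear map, the directional derivative at any point a is f itself:
# (f(a + t h) - f(a)) / t == f(h) for every t
t = 1e-6
assert np.allclose((f(a + t * h) - f(a)) / t, f(h), atol=1e-5)
```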

ANSWER

Actually, $y$ is a matrix, not a vector, whose components are $$\eqalign{ Y_{ij} &= A_{im}x_{m}k_{j} \cr }$$ Its gradient with respect to $x$ is a 3rd order tensor (the derivative index is written $l$ to avoid a clash with the vector $k$) $$\eqalign{ G_{ijl} &= \frac{\partial Y_{ij}}{\partial x_l} = A_{im}\delta_{ml}k_{j} = A_{il}k_{j} \cr }$$ Introducing the 4th order isotropic tensor $$\eqalign{ {\mathcal B}_{ijkl} &= \delta_{il}\delta_{jk} \cr }$$ allows this gradient to be written without subscripts $$\eqalign{ G &= A\star k:{\mathcal B} \cr dY &= G\cdot dx \cr }$$ in which the symbol $(\star)$ denotes a tensor product, $(\cdot)$ denotes a single-dot product, and $(:)$ denotes a double-dot product.
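The index-notation result can be checked directly with `numpy.einsum` (a sketch; sizes and data arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
A = rng.standard_normal((n, n))
x = rng.standard_normal(n)
k = rng.standard_normal(n)

# Third-order gradient tensor G_{ijl} = A_{il} k_j
G = np.einsum("il,j->ijl", A, k)

# dY = G · dx contracts the last index of G against dx
dx = 1e-6 * rng.standard_normal(n)
dY = np.einsum("ijl,l->ij", G, dx)

# Since Y is linear in x, dY should equal the exact change in Y
def Y(z):
    return np.outer(A @ z, k)

assert np.allclose(Y(x + dx) - Y(x), dY, atol=1e-12)
```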