derivative of hadamard product of


I am having difficulty computing the derivative of the following expression:

$$ (xx^T)\circ A $$

with respect to $x \in \mathbb{R}^K$, where $A \in \mathbb{R}^{K\times K}$.

Although the question "Derivative of Hadamard product" explains derivatives for the Hadamard product of two matrices, I could not extend it to my expression.


There are 2 best solutions below

0
On BEST ANSWER

We define a function as $f: \mathbb{R}^k \to \mathbb{R}^{k \times k}, f(x)=(xx^T)\circ A$.

The domain of the function is $\mathbb{R}^k$ (vectors) and its codomain is $\mathbb{R}^{k \times k}$ (matrices).

We want to evaluate the derivative of the function with respect to $x$. For each entry of the matrix, we can calculate the partial derivative of that entry with respect to each entry of the vector $x$. The result can therefore be organized into a third-order tensor of dimension $k \times k \times k$, denoted $\frac{\partial f(x)}{\partial x}$, whose $(i,j,s)$ entry $\frac{\partial f(x)}{\partial x}_{ijs}$ is the partial derivative of the $(i,j)$ entry of $f(x)$ with respect to the $s^{th}$ entry of $x$. We know that $f(x)_{ij}=x_ix_ja_{ij}$, so $$\frac{\partial f(x)}{\partial x}_{ijs}=\begin{cases} 2a_{ii}x_i, & s=i=j\\ a_{ij}x_j, & s=i\neq j\\ a_{ij}x_i, & s=j\neq i\\ 0, & \text{otherwise} \end{cases}$$
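As a sanity check, the entry-wise result can be compared against central finite differences. The following is a minimal sketch in plain Python (nested lists, no external libraries). The function names `f`, `jacobian_tensor`, and `finite_difference_tensor`, as well as the sample values of `x` and `A`, are illustrative assumptions and not part of the original answer. It uses the compact form $a_{ij}(\delta_{is}x_j+\delta_{js}x_i)$ for the partial derivative of the $(i,j)$ entry with respect to $x_s$, which also covers the diagonal case $i=j=s$.

```python
def f(x, A):
    """Return the matrix (x x^T) ∘ A as a nested list."""
    k = len(x)
    return [[x[i] * x[j] * A[i][j] for j in range(k)] for i in range(k)]


def jacobian_tensor(x, A):
    """T[i][j][s] = d f(x)_{ij} / d x_s = A[i][j] * (delta_is * x_j + delta_js * x_i)."""
    k = len(x)
    return [[[A[i][j] * ((i == s) * x[j] + (j == s) * x[i])
              for s in range(k)] for j in range(k)] for i in range(k)]


def finite_difference_tensor(x, A, h=1e-6):
    """Central-difference approximation of the same k x k x k tensor."""
    k = len(x)
    T = [[[0.0] * k for _ in range(k)] for _ in range(k)]
    for s in range(k):
        x_plus = list(x)
        x_plus[s] += h
        x_minus = list(x)
        x_minus[s] -= h
        F_plus, F_minus = f(x_plus, A), f(x_minus, A)
        for i in range(k):
            for j in range(k):
                T[i][j][s] = (F_plus[i][j] - F_minus[i][j]) / (2 * h)
    return T


# Sample data (arbitrary values chosen for illustration).
x = [1.0, 2.0, 3.0]
A = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0],
     [7.0, 8.0, 9.0]]

T = jacobian_tensor(x, A)
T_fd = finite_difference_tensor(x, A)
max_err = max(abs(T[i][j][s] - T_fd[i][j][s])
              for i in range(3) for j in range(3) for s in range(3))
```

Because $f$ is quadratic in $x$, the central difference agrees with the formula up to rounding error, so `max_err` should be very small.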

0
On

$ \def\p{\partial} \def\L{\left}\def\R{\right}\def\LR#1{\L(#1\R)} \def\vec#1{\operatorname{vec}\LR{#1}} \def\diag#1{\operatorname{diag}\LR{#1}} \def\Diag#1{\operatorname{Diag}\LR{#1}} \def\grad#1#2{\frac{\p #1}{\p #2}} $If the matrices in the equation $$\eqalign{ F &= A\circ xx^T \\ }$$ are vectorized and diagonalized $$\eqalign{ f &= \vec{F},\qquad a = \vec{A},\qquad Z = \Diag{a} \\ }$$ then calculating derivatives is easy $$\eqalign{ f &= a\circ\vec{xx^T} \\&= Z\,\vec{xx^T} \\ df &= Z\,\vec{dx\,x^T+x\,dx^T} \\ &= Z\,\Big((x\otimes I)+(I\otimes x)\Big)\,dx \\ \grad{f}{x} &= Z\,\Big((x\otimes I)+(I\otimes x)\Big) \\ }$$ Or, depending on your preferred layout convention, you may want the transpose of this expression.
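The vectorized Jacobian can be built explicitly to see its $k^2 \times k$ shape. The sketch below is plain Python and rests on one assumption worth stating: $\operatorname{vec}$ stacks columns (column-major order), so entry $(i,j)$ of $F$ lands in row $jk+i$ of $f$. The helper names `kron`, `vec`, and `jacobian_kron` and the sample data are hypothetical, chosen only for illustration.

```python
def kron(P, Q):
    """Kronecker product of two matrices given as nested lists."""
    return [[p * q for p in p_row for q in q_row]
            for p_row in P for q_row in Q]


def vec(M):
    """Stack the columns of M into one flat list (column-major order)."""
    rows, cols = len(M), len(M[0])
    return [M[i][j] for j in range(cols) for i in range(rows)]


def jacobian_kron(x, A):
    """Return Z ((x ⊗ I) + (I ⊗ x)) with Z = Diag(vec(A)), as a k^2 x k nested list."""
    k = len(x)
    I = [[1.0 if i == j else 0.0 for j in range(k)] for i in range(k)]
    x_col = [[xi] for xi in x]   # x as a k x 1 column
    a = vec(A)
    left = kron(x_col, I)        # (x ⊗ I), shape k^2 x k
    right = kron(I, x_col)       # (I ⊗ x), shape k^2 x k
    # Multiplying by the diagonal matrix Z scales row r by a[r].
    return [[a[r] * (left[r][s] + right[r][s]) for s in range(k)]
            for r in range(k * k)]


# Sample data (arbitrary values chosen for illustration).
x = [1.0, 2.0, 3.0]
A = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0],
     [7.0, 8.0, 9.0]]

J = jacobian_kron(x, A)
k = len(x)

# Row j*k + i of J should hold the partials of F_ij with respect to x_s.
matches = all(
    J[j * k + i][s] == A[i][j] * ((i == s) * x[j] + (j == s) * x[i])
    for i in range(k) for j in range(k) for s in range(k)
)
```

With a row-major $\operatorname{vec}$ convention the two Kronecker factors would swap roles, which is one reason to fix the convention before comparing results.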


Another approach is to use element-wise derivatives.
The key derivative which enables this is $$\eqalign{ \grad{x}{x_k} &= e_k \\ }$$ where $e_k$ is a standard Cartesian basis vector.

Directly differentiating the equation yields $$\eqalign{ F &= A\circ xx^T \\ \grad{F}{x_k} &= A\circ\LR{e_kx^T+xe_k^T} \\ }$$ Another useful fact is that the $\{e_k\}$ vectors can be distributed over Hadamard products.
This amazing property is not shared by any other vector, except (trivially) the zero vector. $$\eqalign{ {\grad{F_{ij}}{x_k}} &= e_i^T\LR{\grad{F}{x_k}}e_j \\ &= {e_i^TAe_j}\circ\LR{e_i^Te_kx^Te_j+e_i^Txe_k^Te_j} \\ &= A_{ij} \LR{\delta_{ik}x_j+\delta_{jk}x_i} \\ }$$
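To make the structure of a single slice $\partial F/\partial x_k$ concrete, here is a small plain-Python sketch of $A\circ(e_kx^T+xe_k^T)$. The function name `slice_derivative`, the 0-based index `k`, and the sample matrices are assumptions made for illustration only.

```python
def slice_derivative(x, A, k):
    """Return dF/dx_k = A ∘ (e_k x^T + x e_k^T) as a nested list (k is 0-based)."""
    n = len(x)
    e = [1.0 if i == k else 0.0 for i in range(n)]  # standard basis vector e_k
    return [[A[i][j] * (e[i] * x[j] + x[i] * e[j]) for j in range(n)]
            for i in range(n)]


# Sample data (arbitrary values chosen for illustration).
x = [1.0, 2.0, 3.0]
ones = [[1.0] * 3 for _ in range(3)]
A = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0],
     [7.0, 8.0, 9.0]]

# With A = all ones, the slice shows the bare pattern e_k x^T + x e_k^T.
D0 = slice_derivative(x, ones, 0)

# Entry-wise comparison with A_ij (delta_ik x_j + delta_jk x_i) for every k.
agrees = all(
    slice_derivative(x, A, k)[i][j] == A[i][j] * ((i == k) * x[j] + (j == k) * x[i])
    for i in range(3) for j in range(3) for k in range(3)
)
```

For the all-ones matrix, `D0` is nonzero only in row 0 and column 0, which reflects the fact that $x_k$ appears only in the $k$-th row and $k$-th column of $xx^T$.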