Derivative of Frobenius norm of tensor product in component-free notation


I need to take the derivative with respect to $w$ of the following function: $$ f(x,w)=\left\lVert x \otimes w\right\rVert _{F}, $$ where $x \in \mathbb{R}^{n}$, $w \in \mathbb{R}^{n}$, $\otimes$ is the tensor (outer) product, and $\left\lVert\cdot\right\rVert _{F}$ is the Frobenius norm.

I managed to do this calculation in index notation (i.e. in coordinates):

  • $\frac{\partial f}{\partial w_k}= \frac{\partial(\left\lVert x \otimes w\right\rVert _{F})}{\partial(x \otimes w)_{ij}} \cdot \frac{\partial(x \otimes w)_{ij}}{\partial w_k},$ where the Einstein summation convention is used (repeated indices $i$ and $j$ are summed over).

  • $\frac{\partial(\left\lVert x \otimes w\right\rVert _{F})}{\partial(x \otimes w)_{ij}} = \frac{1}{\left(\sum_{mn} |(x \otimes w)_{mn}|^2\right)^{1/2}}\cdot (x \otimes w)_{ij}$

  • $\frac{\partial(x \otimes w)_{ij}}{\partial w_k}= \frac{\partial(x_i w_j)}{\partial w_k} = x_i \delta_{jk},$ where $\delta_{jk}$ is the Kronecker delta.

Combining these: $$ \frac{\partial f}{\partial w_k}= \frac{1}{\left(\sum_{mn} |a_{mn}|^2\right)^{1/2}}\cdot a_{ij}\cdot x_i \delta_{jk},\quad \text{with} \quad a_{ij} \equiv(x \otimes w)_{ij}=x_iw_j. $$

Only a vector remains, since $i$ and $j$ are summed over and $k$ is the sole free index. On the other hand, the last factor, $x_i \delta_{jk}$, carries three indices, so in index-free notation it corresponds to a third-order tensor.
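As a sanity check on the index expression above, here is a minimal NumPy sketch that evaluates the contraction over $i$ and $j$ literally with `einsum`. The dimension, the seed, and the variable names are arbitrary choices for illustration, not part of the question. It also evaluates the contracted form $\|x\|\,w_k/\|w\|$, which follows from $a_{ij}x_i\delta_{jk} = \|x\|^2 w_k$ and $\|a\|_F = \|x\|\,\|w\|$.

```python
import numpy as np

rng = np.random.default_rng(0)      # arbitrary seed, for reproducibility only
n = 4                               # arbitrary dimension
x = rng.normal(size=n)
w = rng.normal(size=n)

a = np.outer(x, w)                  # a_ij = x_i w_j
norm_a = np.sqrt(np.sum(a ** 2))    # Frobenius norm of a
delta = np.eye(n)                   # Kronecker delta as an identity matrix

# df/dw_k = (1 / ||a||_F) * a_ij * x_i * delta_jk, summed over i and j
grad_index = np.einsum("ij,i,jk->k", a, x, delta) / norm_a

# Contracted form: ||x|| * w / ||w||
grad_closed = np.linalg.norm(x) * w / np.linalg.norm(w)

print(np.allclose(grad_index, grad_closed))
```

The three-index object $x_i\delta_{jk}$ appears here only as an intermediate inside the contraction; the result has the single free index $k$.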

Could someone do this calculation in component-free notation (i.e. without coordinates or index notation)?


There are 2 best solutions below

1
On

We have, using the cyclic property of the trace, $$ [f(x,w)]^2 = \operatorname{tr}[(xw^T)(xw^T)^T] = \operatorname{tr}[xw^Twx^T] = \operatorname{tr}[x^Txw^Tw] = \|x\|^2\|w\|^2. $$ Noting that $\frac{\partial}{\partial w}(w^Tw) = 2w^T$, we have, for $w \neq 0$, $$ \begin{aligned} \frac{\partial }{\partial w}\sqrt{\|x\|^2 w^Tw} &= \|x\| \cdot \frac{\partial }{\partial w} (w^Tw)^{1/2} \\ &= \|x\| \cdot \frac 12 \cdot (w^Tw)^{-1/2}\frac{\partial }{\partial w} (w^Tw) \\ &= \|x\| \cdot \frac 12 \cdot (w^Tw)^{-1/2}\cdot (2w^T) = \frac{\|x\|}{\|w\|}\cdot w^T. \end{aligned} $$


Coordinate-free derivation of $\frac{d}{d w}(w^Tw) = 2w^T$: taking $n(w) = w^Tw$, we have $$ \begin{aligned} n(w+h) &= w^Tw + w^Th + h^Tw + h^Th = n(w) + 2w^Th + \|h\|^2 \\ &= n(w) + 2w^Th + o(\|h\|). \end{aligned} $$ Since the part linear in $h$ is $2w^Th$, it follows that $\frac{dn}{dw} = 2w^T$.
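The closed form $\frac{\|x\|}{\|w\|}w^T$ can be checked numerically against central finite differences. The following is only a sketch: the function names, the step size `eps`, the dimension, and the seed are illustrative assumptions, not part of the answer.

```python
import numpy as np

def f(x, w):
    """Frobenius norm of the outer product x w^T."""
    return np.linalg.norm(np.outer(x, w), ord="fro")

def grad_w_closed_form(x, w):
    """Closed form ||x|| * w / ||w|| (entries of the row vector above)."""
    return np.linalg.norm(x) * w / np.linalg.norm(w)

def grad_w_finite_diff(x, w, eps=1e-6):
    """Central-difference approximation of df/dw, one coordinate at a time."""
    g = np.zeros_like(w)
    for k in range(w.size):
        e = np.zeros_like(w)
        e[k] = eps
        g[k] = (f(x, w + e) - f(x, w - e)) / (2 * eps)
    return g

rng = np.random.default_rng(1)      # arbitrary seed
x = rng.normal(size=5)
w = rng.normal(size=5)

max_abs_diff = np.max(np.abs(grad_w_closed_form(x, w) - grad_w_finite_diff(x, w)))
print(max_abs_diff)                 # expected to be small, limited by eps and round-off
```

The formula is undefined at $w = 0$, where $f$ is not differentiable, so the check only makes sense away from that point.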

0
On

The trick is to differentiate the square of the function $$\eqalign{ f^2 &= \|x\otimes w\|_F^2 \\&= (x\otimes w):(x\otimes w) \\ &= (x:x)\,(w:w)\\ 2f\,df &= (x:x)\,(2w:dw) \\ f\,df &= (x:x)\,(w:dw) \\ df &= \frac{1}{f}\|x\|^2w: dw \\ \frac{\partial f}{\partial w} &= \frac{1}{f}\|x\|^2w = \frac{\|x\|}{\|w\|}w \\ }$$ In the above, a colon denotes the trace/Frobenius product, i.e. $$\eqalign{ A:B = \operatorname{Tr}\left(A^TB\right) }$$ The final simplification uses $f = \|x\|\,\|w\|$, which follows from the third line.
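The two ingredients of this argument, the factorisation $(x\otimes w):(x\otimes w) = (x:x)\,(w:w)$ and the differential $df = \frac{1}{f}\|x\|^2 w : dw$, can be illustrated with a short NumPy sketch. The helper name `frob` and the specific vectors are made up for the illustration. For vectors, the colon product reduces to the ordinary dot product.

```python
import numpy as np

def frob(A, B):
    """Trace/Frobenius product A:B = Tr(A^T B) for matrices."""
    return np.trace(A.T @ B)

# Arbitrary fixed vectors, chosen only for illustration
x = np.array([1.0, 2.0, 3.0])
w = np.array([0.5, -1.0, 2.0])

X = np.outer(x, w)
lhs = frob(X, X)                    # (x ⊗ w):(x ⊗ w)
rhs = (x @ x) * (w @ w)             # (x:x)(w:w)

f = np.sqrt(lhs)
grad = (x @ x) * w / f              # (1/f) ||x||^2 w

# First-order check: f(w + dw) - f(w) should be close to grad : dw for small dw
dw = 1e-6 * np.array([1.0, -2.0, 0.5])
f_shifted = np.sqrt(frob(np.outer(x, w + dw), np.outer(x, w + dw)))
actual_change = f_shifted - f
predicted_change = grad @ dw

print(np.isclose(lhs, rhs), abs(actual_change - predicted_change))
```

The gap between the actual and predicted change is second order in `dw`, which is what the differential form of the derivation claims.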