For a symmetric matrix $X \in \Bbb R^{n\times n}$, let $$f(X) := u^\top \mbox{diag}(X 1_n) v$$ What is the derivative of $f$ with respect to $X$?
Here $X 1_n$ is the vector of row sums of $X$, an element of $\Bbb R^n$, and $\mbox{diag}(\cdot)$ maps that vector to a diagonal matrix.
$ \def\o{{\tt1}}\def\p{\partial} \def\L{\left}\def\R{\right} \def\LR#1{\L(#1\R)} \def\diag#1{\operatorname{diag}\LR{#1}} \def\Diag#1{\operatorname{Diag}\LR{#1}} \def\trace#1{\operatorname{Tr}\LR{#1}} \def\qiq{\quad\implies\quad} \def\grad#1#2{\frac{\p #1}{\p #2}} \def\fracLR#1#2{\LR{\frac{#1}{#2}}} $The Frobenius product is a concise notation for the trace $$\eqalign{ A:B &= \sum_{i=1}^m\sum_{j=1}^n A_{ij}B_{ij} \;=\; \trace{A^TB} \\ A:A &= \|A\|^2_F \\ }$$ This is also called the double-dot or double contraction product.
When applied to vectors $(n=\o)$ it reduces to the standard dot product.
The properties of the underlying trace function allow the terms in a Frobenius product to be rearranged in many different ways, e.g. $$\eqalign{ A:B &= B:A \\ A:B &= A^T:B^T \\ C:\LR{AB} &= \LR{CB^T}:A \\&= \LR{A^TC}:B \\ }$$ As with the Hadamard product, the matrix on each side of the multiplication symbol $(:)$ must have exactly the same dimensions.
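The rearrangement rules above can be spot-checked numerically. The following is a minimal pure-Python sketch, not part of the original answer; the helper names (`frob`, `transpose`, `matmul`, `trace`) and the small integer matrices are assumptions chosen for illustration.

```python
# Illustrative sketch: all helper names and test matrices are made up.

def frob(A, B):
    """Frobenius product A:B = sum_ij A_ij * B_ij (shapes must match)."""
    assert len(A) == len(B) and all(len(ra) == len(rb) for ra, rb in zip(A, B))
    return sum(a * b for ra, rb in zip(A, B) for a, b in zip(ra, rb))

def transpose(A):
    return [list(col) for col in zip(*A)]

def matmul(A, B):
    Bt = transpose(B)
    return [[sum(a * b for a, b in zip(row, col)) for col in Bt] for row in A]

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

A = [[1, 2, 3], [4, 5, 6]]        # 2x3
B = [[7, 8], [9, 10], [11, 12]]   # 3x2
C = [[1, -1], [2, 0]]             # 2x2, same shape as the product A B
D = transpose(B)                  # 2x3, same shape as A

# A:D = Tr(A^T D) and A:D = A^T:D^T
print(frob(A, D) == trace(matmul(transpose(A), D)))
print(frob(A, D) == frob(transpose(A), transpose(D)))

# C:(AB) = (C B^T):A = (A^T C):B
lhs = frob(C, matmul(A, B))
mid = frob(matmul(C, transpose(B)), A)
rhs = frob(matmul(transpose(A), C), B)
print(lhs == mid == rhs)
```

Integer entries keep the arithmetic exact, so the comparisons do not need a tolerance.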
Let's also introduce the Diag() and diag() functions. The first transforms its vector argument into a diagonal matrix, while the second generates a vector from the main diagonal of its matrix argument.
Let $\odot$ denote the Hadamard product, $\o$ the all-ones vector, and note the following identities $$\eqalign{ &\diag{\Diag a} = a = \Diag{a}\,\o \\ &\diag{ab^T} = \LR{a\odot b} = \Diag{a}\,b \\ &\diag{a\o^T} = \LR{a\odot \o} = \Diag{a}\,\o = a \\ \\ }$$
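The Diag/diag identities can be checked the same way. This is a small pure-Python sketch under the assumption that vectors are plain lists and matrices are lists of rows; the function names mirror the notation above but are otherwise hypothetical helpers.

```python
# Illustrative sketch: helper names and test vectors are made up.

def Diag(a):
    """Vector -> diagonal matrix."""
    n = len(a)
    return [[a[i] if i == j else 0 for j in range(n)] for i in range(n)]

def diag(A):
    """Square matrix -> vector holding its main diagonal."""
    return [A[i][i] for i in range(len(A))]

def outer(a, b):
    """Outer product a b^T."""
    return [[x * y for y in b] for x in a]

def hadamard(a, b):
    """Elementwise (Hadamard) product of two vectors."""
    return [x * y for x, y in zip(a, b)]

def matvec(A, x):
    return [sum(r * c for r, c in zip(row, x)) for row in A]

a = [2, -1, 3]
b = [4, 5, -6]
ones = [1, 1, 1]

print(diag(Diag(a)) == a == matvec(Diag(a), ones))
print(diag(outer(a, b)) == hadamard(a, b) == matvec(Diag(a), b))
print(diag(outer(a, ones)) == a)
```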
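Use the above notation to write the objective function, then calculate its gradient. The third line below uses $A:\Diag{b} = \diag{A}:b$ (only the diagonal entries of $A$ meet nonzero entries of $\Diag{b}$), and the fourth uses the rearrangement rule $a:X\o = a\o^T:X$. $$\eqalign{ \phi &= u^T\Diag{X\o}v \\ &= uv^T:\Diag{X\o} \\ &= \diag{uv^T}:{X\o} \\ &= \LR{u\odot v}\o^T:X \\ \grad{\phi}{X} &= \LR{u\odot v}\o^T \\ }$$ Note that this derivation treats the entries of $X$ as independent; the symmetry constraint on $X$ is not imposed.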
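As a sanity check on the result, the closed-form gradient can be compared against entrywise differences. The sketch below is pure Python with made-up test values; `f`, `closed_form_gradient` and `difference_gradient` are hypothetical names. Because $f$ is linear in $X$, a unit step gives the exact partial derivative, and each entry of $X$ is perturbed on its own, so symmetry is not enforced.

```python
# Illustrative sketch: function names and the values of u, v, X are made up.

def f(X, u, v):
    """f(X) = u^T Diag(X 1) v = sum_i u_i * v_i * (i-th row sum of X)."""
    return sum(ui * vi * sum(row) for ui, vi, row in zip(u, v, X))

def closed_form_gradient(u, v):
    """(u Hadamard v) 1^T: every entry of row i equals u_i * v_i."""
    n = len(u)
    return [[ui * vi] * n for ui, vi in zip(u, v)]

def difference_gradient(X, u, v, h=1):
    """Perturb each entry X_ij on its own (symmetry of X is NOT enforced)."""
    base = f(X, u, v)
    G = []
    for i in range(len(X)):
        row = []
        for j in range(len(X[i])):
            Xp = [r[:] for r in X]   # copy so X itself is left untouched
            Xp[i][j] += h
            row.append((f(Xp, u, v) - base) / h)
        G.append(row)
    return G

u = [1, 2, 3]
v = [4, -1, 2]
X = [[2, 1, 0], [1, 3, -1], [0, -1, 4]]   # symmetric example

G_closed = closed_form_gradient(u, v)
G_diff = difference_gradient(X, u, v)
print(G_closed == G_diff)
```

Each row of the gradient is constant: changing any entry in row $i$ of $X$ changes the $i$-th row sum by the same amount, which is then weighted by $u_i v_i$.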