Matrix derivative of Frobenius norm with Hadamard product inside


I have to find the derivative of $$||R - (P \circ \gamma \gamma^T)||_F^2$$ with respect to $\gamma$, where $||\cdot||_F$ is the Frobenius norm and $\circ$ is the Hadamard (elementwise) product. $R$ and $P$ are of dimension $n \times n$ and $\gamma$ of dimension $n \times r$.

I know that in general the derivative of $||X||_F^2$ w.r.t. $X$ is $2X$, and I've then tried to use the chain rule, but I can't manage to arrive at a coherent answer.

2 Answers

Best Answer

Let's define a new matrix variable $$M = P\circ\gamma\gamma^T - R$$ and use a colon to denote the trace/Frobenius product, i.e. $A:B={\rm tr}(A^TB)$. (The sign of $M$ doesn't matter here, since only $\|M\|_F^2$ appears below.)

Rules for rearranging the Frobenius product follow from the cyclic properties of the trace. Also note that the Frobenius and Hadamard products are mutually commutative $$\eqalign{ A:B &= B:A \cr A\circ B &= B\circ A \cr A:B\circ C &= A\circ B:C \cr\cr }$$
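These rules are easy to confirm numerically; here is a quick NumPy sketch checking the two colon identities (all variable names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
A, B, C = rng.standard_normal((3, 4, 4))  # three random 4x4 matrices

# Frobenius product A:B = tr(A^T B)
frob = lambda X, Y: np.trace(X.T @ Y)

# A:B == B:A  (symmetry of the Frobenius product)
assert np.isclose(frob(A, B), frob(B, A))

# A:(B∘C) == (A∘B):C  (a Hadamard factor can move across the colon)
assert np.isclose(frob(A, B * C), frob(A * B, C))
```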

Now we're ready to write down the function, differential, and gradient $$\eqalign{ f &= \|M\|_F^2 = M:M \cr \cr df &= 2M:dM = 2M:P\circ d(\gamma\gamma^T) \cr &= 2P\circ M:(d\gamma\,\gamma^T+\gamma\,d\gamma^T) \cr &= 2(P\circ M+P^T\circ M^T):d\gamma\,\gamma^T \cr &= 2(P\circ M+P^T\circ M^T)\gamma:d\gamma \cr \cr \frac{\partial f}{\partial \gamma} &= 2(P\circ M+P^T\circ M^T)\gamma \cr\cr }$$
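The resulting gradient is easy to validate against central finite differences; a minimal NumPy sketch under the dimensions stated in the question (variable names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
n, r = 5, 2
R = rng.standard_normal((n, n))
P = rng.standard_normal((n, n))
g = rng.standard_normal((n, r))  # gamma

# f(gamma) = ||R - P∘(gamma gamma^T)||_F^2
f = lambda g: np.linalg.norm(R - P * (g @ g.T), "fro") ** 2

# Analytic gradient: M = P∘(gamma gamma^T) - R, df/dgamma = 2(P∘M + P^T∘M^T)gamma
M = P * (g @ g.T) - R
grad = 2 * (P * M + P.T * M.T) @ g

# Central finite differences, entry by entry
eps = 1e-6
num = np.zeros_like(g)
for idx in np.ndindex(*g.shape):
    e = np.zeros_like(g)
    e[idx] = eps
    num[idx] = (f(g + e) - f(g - e)) / (2 * eps)

assert np.allclose(grad, num, rtol=1e-4, atol=1e-5)
```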


I'll work with the column-vector case ($r = 1$, writing $x$ for $\gamma$) and include a factor of $\frac{1}{2}$:

$$ f \left( x \right) = \frac{1}{2} \left\| P \circ x {x}^{T} - R \right\|_{F}^{2} $$

Defining $ X = x {x}^{T} $, we can first differentiate with respect to $ X $ and then apply the chain rule through the entries of $ X $:

$$ \frac{\partial}{\partial {x}_{k}} f \left( x \right) = \sum_{i, j} \left[ \frac{\partial}{\partial X} \frac{1}{2} \left\| P \circ X - R \right\|_{F}^{2} \right]_{i, j} \frac{\partial {X}_{i, j}}{\partial {x}_{k}} $$

Now, the first is easy:

$$ \frac{ d }{d X} \frac{1}{2} \left\| P \circ X - R \right\|_{F}^{2} = P \circ \left( P \circ X - R \right) $$

The second is just the entrywise derivative of the outer product $ X = x {x}^{T} $:

$$ \frac{\partial {X}_{i, j}}{\partial {x}_{k}} = \frac{\partial \left( {x}_{i} {x}_{j} \right)}{\partial {x}_{k}} = \delta_{i k} {x}_{j} + {x}_{i} \delta_{j k} $$

where $ \delta $ is the Kronecker delta.

Hence:

$$ \frac{d}{d {x}_{k}} f \left( x \right) = \sum_{i} {P}_{i, k} \left( {P}_{i, k} {X}_{i, k} - {R}_{i, k} \right) {x}_{i} + \sum_{j} {P}_{k, j} \left( {P}_{k, j} {X}_{k, j} - {R}_{k, j} \right) {x}_{j} $$
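For what it's worth, the two sums collapse into a matrix expression: with $ M = P \circ X - R $, the $k$-th entry of the first sum is $ \left[ \left( P \circ M \right)^{T} x \right]_{k} $ and of the second $ \left[ \left( P \circ M \right) x \right]_{k} $, so $ \nabla f = \left( P \circ M + \left( P \circ M \right)^{T} \right) x $, matching the other answer up to the factor of $ \frac{1}{2} $. A NumPy check of this equivalence (names illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 6
P = rng.standard_normal((n, n))
R = rng.standard_normal((n, n))
x = rng.standard_normal(n)

X = np.outer(x, x)
M = P * X - R  # P∘X - R

# Componentwise gradient, exactly as in the two sums above
grad_sum = np.array([
    sum(P[i, k] * (P[i, k] * X[i, k] - R[i, k]) * x[i] for i in range(n))
    + sum(P[k, j] * (P[k, j] * X[k, j] - R[k, j]) * x[j] for j in range(n))
    for k in range(n)
])

# Equivalent matrix form
grad_mat = (P * M + (P * M).T) @ x

assert np.allclose(grad_sum, grad_mat)
```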

I validated it using Finite Differences (Numerical Derivative) in MATLAB:

% mP, mR, dimOrder, difMode and epsVal are set up earlier in the script;
% CalcFunGrad() computes a finite-difference gradient (see the linked repository).
vX          = randn([dimOrder, 1]);
hNormFun    = @(vX) 0.5 * (norm( mP .* (vX * vX.') - mR, 'fro' ) ^ 2);

vGNumerical = CalcFunGrad(vX, hNormFun, difMode, epsVal);

mX          = vX * vX.';
vGAnalytic  = zeros([dimOrder, 1]);

for kk = 1:dimOrder
    for ii = 1:dimOrder
        vGAnalytic(kk) = vGAnalytic(kk) + mP(ii, kk) * (mP(ii, kk) .* mX(ii, kk) - mR(ii, kk)) * vX(ii);
    end
    for jj = 1:dimOrder
        vGAnalytic(kk) = vGAnalytic(kk) + mP(kk, jj) * (mP(kk, jj) .* mX(kk, jj) - mR(kk, jj)) * vX(jj);
    end
end

disp(['Maximum Deviation Between Analytic and Numerical Derivative - ', num2str( max(abs(vGNumerical - vGAnalytic)) )]);

Maximum Deviation Between Analytic and Numerical Derivative - 4.5832e-08

The full code can be found in my StackExchange Mathematics Q2444284 GitHub Repository.