Gradient of an optimization function - Frobenius norm and Hadamard product


I am trying to solve a problem for my Optimization class that asks for the gradient of the following function:

$$g(P)=\frac{1}{2}||1_K\circ(R-Q^0P)||_F^2+\frac{\rho}{2}||Q^0||_F^2+\frac{\rho}{2}||P||_F^2$$

where $\rho, R, Q^0$ and $1_K$ are given constants in this case.

But I am totally stuck, particularly on the first term (the one that includes the Frobenius norm and the Hadamard product).

I tried to use the definition of the Frobenius norm $||A||_F=\sqrt{Tr(AA^H)}$ but I do not know how to handle it in this situation.

Thank you very much.

Best answer

Let's rename the variables, using lowercase letters for vectors and uppercase letters for matrices, and omit all of the subscripts and superscripts.

$$\eqalign{ p=P,\quad Q=Q^0,\quad r=R,\quad u={\tt1}_K }$$

Let's also define the auxiliary vectors below. Note that $s$ has the opposite sign to $R-Q^0P$ in the question, which does not change the squared norm.

$$\eqalign{
s &= Qp-r \\
w &= u\circ s \\
}$$

We'll also need the trace/Frobenius product

$$\eqalign{
A:B &= {\rm Tr}(A^TB) = B:A \\
A:A &= \big\|A\big\|^2_F \\
}$$

Conveniently, the Frobenius product commutes with the Hadamard product, i.e.

$$A:(B\circ C) = (A\circ B):C$$

Use the above to rewrite the objective function, then calculate the gradient as follows.

$$\eqalign{
g &= \tfrac 12(w:w) + \tfrac \rho2(p:p) + \tfrac \rho2(Q:Q) \\
\\
dg &= (w:dw) + \rho(p:dp) + 0 \\
&= w:(u\circ ds) + \rho(p:dp) \\
&= (u\circ w):ds + \rho(p:dp) \\
&= (u\circ w):Q\,dp + \rho(p:dp) \\
&= Q^T(u\circ w):dp + \rho(p:dp) \\
&= \Big(Q^T(u\circ w)+ \rho p\Big):dp \\
&= \Big(\rho p + Q^T\big(u\circ u\circ(Qp-r)\big)\Big):dp \\
\\
\frac{\partial g}{\partial p} &= \rho p + Q^T\big(u\circ u\circ(Qp-r)\big) \\
}$$
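A quick numerical sanity check of the result can be sketched in NumPy by comparing the closed-form gradient with central finite differences. This is only an illustration under stated assumptions: the sizes `m, k, n`, the random data, the value of `rho`, and the helper names `g`, `grad_g` and `numerical_grad` are all hypothetical and not part of the question. The mask `U` is assumed to be a 0/1 indicator matrix, and $P$ and $R$ are taken to be matrices here, since the derivation goes through unchanged in that case.

```python
import numpy as np

def g(P, Q, R, U, rho):
    """Objective: 0.5*||U o (R - QP)||_F^2 + 0.5*rho*||Q||_F^2 + 0.5*rho*||P||_F^2."""
    W = U * (R - Q @ P)  # '*' is the elementwise (Hadamard) product
    return 0.5 * np.sum(W**2) + 0.5 * rho * np.sum(Q**2) + 0.5 * rho * np.sum(P**2)

def grad_g(P, Q, R, U, rho):
    """Closed-form gradient from the derivation: rho*P + Q^T (U o U o (QP - R))."""
    return rho * P + Q.T @ (U * U * (Q @ P - R))

def numerical_grad(f, P, eps=1e-6):
    """Central finite differences, perturbing one entry of P at a time."""
    G = np.zeros_like(P)
    for idx in np.ndindex(P.shape):
        E = np.zeros_like(P)
        E[idx] = eps
        G[idx] = (f(P + E) - f(P - E)) / (2 * eps)
    return G

# Hypothetical problem sizes and random data, chosen only for the check.
rng = np.random.default_rng(0)
m, k, n = 5, 3, 4
Q = rng.standard_normal((m, k))
P = rng.standard_normal((k, n))
R = rng.standard_normal((m, n))
U = (rng.random((m, n)) < 0.6).astype(float)  # assumed 0/1 indicator mask
rho = 0.1

analytic = grad_g(P, Q, R, U, rho)
numeric = numerical_grad(lambda X: g(X, Q, R, U, rho), P)
max_err = np.max(np.abs(analytic - numeric))
print("max abs difference:", max_err)
```

The two gradients should agree up to floating-point rounding, because $g$ is quadratic in $P$ and central differences are exact for quadratics apart from rounding error. If ${\tt1}_K$ really is a 0/1 indicator (an assumption, since the question does not say), then $u\circ u=u$ and the gradient simplifies to $\rho p + Q^T\big(u\circ(Qp-r)\big)$.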