I am not sure how to start with computing the gradient $\frac{\partial L}{\partial X}$ of the following function:
\begin{align} L = \| Y - X \|_F^2 + \sum_{i=1}^I u_i \left( \| A_i X_i \|_2^2 - \alpha_i \right) + \sum_{k=1}^K v_k \left( {\rm tr} \left( X^* B_k X^T \right) - \beta_k \right) \end{align} where
- $Y \in \mathbb{C}^{m \times n}$, i.e., a complex-valued matrix,
- $X \in \mathbb{C}^{m \times n}$,
- $X^*$ denotes the complex conjugate only (no transpose), and $X^T$ denotes the transpose of $X$,
- $X_i \in \mathbb{C}^{m \times 1}$ denotes the $i$th column of $X$,
- $A_i \in \mathbb{C}^{p \times m}$ is given,
- $B_k \in \mathbb{C}^{n \times n}$ is given,
- $u_i, \alpha_i, v_k, \beta_k \in \mathbb{R}$ are given.
I thought that if I could write the second term in matrix form, I could then move forward and compute the gradient, but I have not managed to do that. Your suggestions and help will be highly appreciated.
Work on it one piece at a time.
The first piece:
$$\eqalign{K &= \|X-Y\|_F^2 = (X-Y)^*:(X-Y) \cr dK &= (X-Y)^*:dX}$$

The second piece, for a single column $Xe$ (with the index $i$ suppressed):
$$\eqalign{M &=\|AXe\|_2^2 =(AXe)^*:(AXe)\cr dM &=(AXe)^*:A\,dX\,e =A^TA^*X^*ee^T:dX}$$

And the third:
$$\eqalign{ N &={\rm Tr}(X^*BX^T) =X^*B:X\cr dN &=X^*B:dX}$$

Now put it all together with the summation coefficients, omitting the constant terms $u_i\alpha_i$ and $v_k\beta_k$, which do not affect the differential:
$$\eqalign{ L &= K + \sum_iu_iM_i + \sum_kv_kN_k \cr dL &= \Big((X-Y)^* + \sum_iu_iA_i^TA_i^*X^*e_ie_i^T + \sum_kv_kX^*B_k\Big):dX \cr \frac{\partial L}{\partial X} &= X^*-Y^* + \sum_iu_iA_i^TA_i^*X^*e_ie_i^T + \sum_kv_kX^*B_k}$$

In the above derivation, a colon denotes the double-dot product $$A:B = {\rm Tr}(A^TB)$$ whose factors can be rearranged as $A:BC = B^TA:C = AC^T:B$. Also, $X^*$ is treated as being independent of $X$ under differentiation; this is the convention of Wirtinger derivatives, also known as the ${\mathbb{CR}}$-calculus.
Finally, $e_i$ denotes the $i^{th}$ standard basis vector of ${\mathbb R}^{n}$, so that $Xe_i = X_i$ is the $i^{th}$ column of $X$.
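If you want to sanity-check the result numerically, here is a minimal NumPy sketch. It compares the closed-form expression for $\frac{\partial L}{\partial X}$ with a central-difference estimate of the Wirtinger derivative $\frac12\left(\frac{\partial L}{\partial\,{\rm Re}X} - i\,\frac{\partial L}{\partial\,{\rm Im}X}\right)$. All function names (`loss`, `wirtinger_grad`, `numeric_wirtinger_grad`, `max_gradient_error`), the test sizes, and the choice of one $A_i$ per column (i.e. $I = n$) are my own assumptions for illustration, not part of the question.

```python
import numpy as np

def loss(X, Y, A, B, u, v, alpha, beta):
    """Evaluate L; the value may be complex when some B_k is not Hermitian."""
    total = np.linalg.norm(Y - X, "fro") ** 2
    for i in range(len(A)):
        total += u[i] * (np.linalg.norm(A[i] @ X[:, i]) ** 2 - alpha[i])
    for k in range(len(B)):
        total += v[k] * (np.trace(X.conj() @ B[k] @ X.T) - beta[k])
    return total

def wirtinger_grad(X, Y, A, B, u, v):
    """Closed-form dL/dX from the derivation, with X^* held fixed."""
    G = (X - Y).conj()
    for i in range(len(A)):
        # A_i^T A_i^* X^* e_i e_i^T is nonzero only in column i.
        G[:, i] += u[i] * (A[i].T @ A[i].conj() @ X[:, i].conj())
    for k in range(len(B)):
        G += v[k] * (X.conj() @ B[k])
    return G

def numeric_wirtinger_grad(f, X, h=1e-4):
    """Central differences: dL/dX = (dL/dRe(X) - 1j * dL/dIm(X)) / 2."""
    G = np.zeros_like(X)
    for idx in np.ndindex(X.shape):
        E = np.zeros_like(X)
        E[idx] = 1.0
        d_re = (f(X + h * E) - f(X - h * E)) / (2 * h)
        d_im = (f(X + 1j * h * E) - f(X - 1j * h * E)) / (2 * h)
        G[idx] = 0.5 * (d_re - 1j * d_im)
    return G

def max_gradient_error(seed=0, m=3, n=4, p=2, K=2):
    """Largest entrywise gap between the formula and the numeric estimate."""
    rng = np.random.default_rng(seed)
    crandn = lambda *s: rng.standard_normal(s) + 1j * rng.standard_normal(s)
    X, Y = crandn(m, n), crandn(m, n)
    A = [crandn(p, m) for _ in range(n)]  # assumption: one A_i per column
    B = [crandn(n, n) for _ in range(K)]
    u, alpha = rng.standard_normal(n), rng.standard_normal(n)
    v, beta = rng.standard_normal(K), rng.standard_normal(K)
    f = lambda Z: loss(Z, Y, A, B, u, v, alpha, beta)
    G_formula = wirtinger_grad(X, Y, A, B, u, v)
    G_numeric = numeric_wirtinger_grad(f, X)
    return float(np.max(np.abs(G_formula - G_numeric)))
```

Calling `max_gradient_error()` should return a value close to floating-point noise if the formula and the code agree. Because $L$ is quadratic in the real and imaginary parts of $X$, the central difference has no truncation error, so a fairly large step `h` is used here to keep rounding error down.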