I have the following loss function to minimize: $\hat{\mathbf{A}} = \arg \min_{\mathbf{A}} \frac{1}{2}{\parallel{\mathbf{Y} - \mathbf{K} \left(\left( \mathbf{D}\mathbf{A}\right)\odot\mathbf{M}\right)}\parallel}_{F}^{2} + \frac{1}{2}{\parallel{\mathbf{W} - \mathbf{L}\left(\left( \mathbf{D}\mathbf{A}\right)\odot\mathbf{H}\right) \mathbf{S}}\parallel}_{F}^{2}$
where ${\parallel\mathbf{X}\parallel}_{F}^{2} = \operatorname{Tr}(\mathbf{X}\mathbf{X}^{H})$ is the squared Frobenius norm and $\odot$ denotes the Hadamard (element-wise) product.
For the first term: $\mathbf{Y}\in\mathbb{C}^{a\times l}$, $\mathbf{K}\in\mathbb{C}^{a\times b}$, $\mathbf{D}\in\mathbb{C}^{b\times d}$, $\mathbf{A}\in\mathbb{C}^{d\times l}$, and $\mathbf{M}\in\mathbb{C}^{b\times l}$.
For the second term: $\mathbf{W}\in\mathbb{C}^{b\times c}$, $\mathbf{L}\in\mathbb{C}^{b\times b}$, $\mathbf{H}\in\mathbb{C}^{b\times l}$, and $\mathbf{S}\in\mathbb{C}^{l\times c}$.
I know that when the loss function has the form
$J(X) = \frac{1}{2}{\parallel{\mathbf{Y} - \mathbf{K} \left( \mathbf{X}\odot\mathbf{M}\right)}\parallel}_{F}^{2}$
then its gradient is
$\nabla J(X) = -\left(\mathbf{K}^{H}\left( \mathbf{Y} - \mathbf{K} \left( \mathbf{X}\odot\mathbf{M}\right)\right)\right)\odot \mathbf{M}^{*}$, where $\mathbf{M}^{*}$ is the complex conjugate of $\mathbf{M}$.
But in this case, where $\mathbf{A}$ enters both terms through the product $\mathbf{D}\mathbf{A}$, I find it much harder. Is there a way to compute its gradient?
$ \def\a{\alpha}\def\b{\beta}\def\g{\gamma}\def\l{\lambda} \def\o{{\tt1}}\def\p{\partial} \def\LR#1{\left(#1\right)} \def\BR#1{\Big(#1\Big)} \def\bR#1{\big(#1\big)} \def\FF#1{\left\|#1\right\|_F^2} \def\Diag#1{\operatorname{Diag}\LR{#1}} \def\trace#1{\operatorname{Tr}\LR{#1}} \def\qiq{\quad\implies\quad} \def\grad#1#2{\frac{\p #1}{\p #2}} \def\c#1{\color{red}{#1}} \def\CLR#1{\c{\LR{#1}}} \def\gradLR#1#2{\LR{\grad{#1}{#2}}} $For typing convenience, define the matrix variables $$\eqalign{ B &= {H\odot\LR{DA}} &\qiq dB = {H\odot\LR{D\;dA}} \\ C &= {LBS-W} &\qiq dC = L\:dB\:S\\ }$$ and introduce the Frobenius product, which is a concise notation for the trace $$\eqalign{ A:B &= \sum_{i=1}^m\sum_{j=1}^n A_{ij}B_{ij} \;=\; \trace{A^TB} \\ A:A &= \|A\|^2_F \\ }$$ The properties of the underlying trace function allow the terms in a Frobenius product to be rearranged in many different ways, e.g. $$\eqalign{ A:B &= B:A \\ A:B &= A^T:B^T \\ C:\LR{AB} &= \LR{CB^T}:A \\&= \LR{A^TC}:B \\ }$$ Finally, the Frobenius and Hadamard products commute $$\eqalign{ A:\LR{B\odot C} \,=\, \LR{A\odot B}:C \:=\: \sum_{i=1}^m\sum_{j=1}^n A_{ij}B_{ij}C_{ij} \\ \\ }$$
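The rearrangement rules above are easy to sanity-check numerically. The following is only a sketch for real matrices: the helper name `frob`, the matrix sizes, and the random test data are illustrative assumptions, not part of the problem.

```python
import numpy as np

rng = np.random.default_rng(0)

def frob(A, B):
    """Frobenius product A:B = sum_ij A_ij * B_ij = Tr(A^T B) (real case)."""
    return np.sum(A * B)

# Arbitrary (assumed) sizes, chosen only so that the products are defined.
m, n, p = 4, 3, 5
A = rng.standard_normal((m, n))
B = rng.standard_normal((n, p))
C = rng.standard_normal((m, p))
E = rng.standard_normal((m, p))
G = rng.standard_normal((m, p))

# C:(AB) = (C B^T):A = (A^T C):B
lhs = frob(C, A @ B)
rearranged_1 = frob(C @ B.T, A)
rearranged_2 = frob(A.T @ C, B)

# C:(E ⊙ G) = (C ⊙ E):G  (Frobenius and Hadamard products commute)
hadamard_lhs = frob(C, E * G)
hadamard_rhs = frob(C * E, G)

print(np.isclose(lhs, rearranged_1), np.isclose(lhs, rearranged_2))
print(np.isclose(hadamard_lhs, hadamard_rhs))
```

Each comparison should agree up to floating-point rounding.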
Use the above notation to rewrite the last term in your loss function. Then calculate its differential and gradient. $$\eqalign{ \a &= \tfrac 12\:{C:C} \\ d\a &= C:dC \\ &= C:\LR{L\:dB\:S} \\ &= \LR{L^TCS^T}:dB \\ &= \LR{L^TCS^T}:\LR{H\odot\LR{D\;dA}} \\ &= H\odot\LR{L^TCS^T}:\LR{D\;dA} \\ &= D^T\LR{H\odot\LR{L^TCS^T}}:dA \\ \grad{\a}{A} &= D^T\LR{H\odot\LR{L^T\c{C}S^T}} \\ }$$ or, in terms of the original variables $$\eqalign{ \grad{\a}{A} &= D^T\LR{H\odot\LR{L^T\CLR{L\LR{H\odot\LR{DA}}S-W}S^T}} \\ }$$ The calculation for the other half of the loss function is analogous, after substituting variables $$\eqalign{ &\LR{\a,W,L,H,S} \to \LR{\b,Y,K,M,I} \\ &\grad{\b}{A} = D^T\LR{M\odot\LR{K^T\CLR{K\LR{M\odot\LR{DA}}-Y}}} \\ }$$ Since $S$ was replaced by an identity matrix, it has been completely omitted.
The gradient of the full loss function is therefore the sum of the two pieces $$\eqalign{ \l &= \a + \b \qiq \grad{\l}{A} = \grad{\a}{A} + \grad{\b}{A} \\ }$$
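For real matrices, the closed-form gradient can be compared against central finite differences. This is a minimal sketch under stated assumptions: the dimensions, the random data, and the function names `loss`, `grad`, and `numerical_grad` are all made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed (arbitrary) dimensions matching the shapes in the question.
a, b, c, d, l = 4, 5, 3, 2, 6
Y, K = rng.standard_normal((a, l)), rng.standard_normal((a, b))
D, M = rng.standard_normal((b, d)), rng.standard_normal((b, l))
W, L = rng.standard_normal((b, c)), rng.standard_normal((b, b))
H, S = rng.standard_normal((b, l)), rng.standard_normal((l, c))
A = rng.standard_normal((d, l))

def residuals(A):
    DA = D @ A
    r_beta = K @ (DA * M) - Y        # first term:  K((DA) ⊙ M) - Y
    r_alpha = L @ (DA * H) @ S - W   # second term: L((DA) ⊙ H)S - W
    return r_beta, r_alpha

def loss(A):
    r_beta, r_alpha = residuals(A)
    return 0.5 * np.sum(r_beta**2) + 0.5 * np.sum(r_alpha**2)

def grad(A):
    """Closed-form gradient for real matrices, as derived above."""
    r_beta, r_alpha = residuals(A)
    g_beta = D.T @ (M * (K.T @ r_beta))
    g_alpha = D.T @ (H * (L.T @ r_alpha @ S.T))
    return g_alpha + g_beta

def numerical_grad(f, A, eps=1e-4):
    """Entry-by-entry central differences."""
    G = np.zeros_like(A)
    for idx in np.ndindex(*A.shape):
        step = np.zeros_like(A)
        step[idx] = eps
        G[idx] = (f(A + step) - f(A - step)) / (2 * eps)
    return G

G_analytic = grad(A)
G_numeric = numerical_grad(loss, A)
max_rel_err = np.max(np.abs(G_numeric - G_analytic)) / np.max(np.abs(G_analytic))
print(max_rel_err)
```

Because the loss is quadratic in $A$, central differences are exact up to rounding, so the relative error should be tiny.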
Update
The above derivation mistakenly used real matrices. Complex matrices necessitate the use of the ${\mathbb{CR}}$-calculus, also known as Wirtinger derivatives. The basic idea is to treat a variable and its complex conjugate as two formally independent variables.
Here is a simple example. $\,$ Let $X\in{\mathbb C}^{m\times n}\:$ and $X^*$ denote its complex conjugate, then $$\eqalign{ \phi &= \FF{X} \;=\; X:X^* \\ d\phi &= X^*:dX \;+\; X:dX^* \\ \grad{\phi}{X} &= X^*, \qquad \grad{\phi}{X^*} = X \;\equiv\; \gradLR{\phi}{X}^* \\ }$$ Applying these ideas to the above problem yields $$\eqalign{ \grad{\a}{A} &= \frac 12\, D^T\LR{H\odot\LR{L^T\CLR{L\LR{H\odot\LR{DA}}S-W}^{\c*}S^T}} \\ \grad{\b}{A} &= \frac 12\, D^T\LR{M\odot\LR{K^T\CLR{K\LR{M\odot\LR{DA}}-Y}^{\c*}}} \\ }$$
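The complex formulas can be checked the same way. For a real-valued function of a complex matrix, the first-order change along a direction $E$ is $2\,\mathrm{Re}\left(G:E\right)$, where $G$ is the Wirtinger derivative with respect to $A$. The sketch below compares that against a finite difference; the dimensions, random data, and the names `crandn`, `loss`, and `wirtinger_grad` are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

def crandn(*shape):
    """Random complex matrix (hypothetical helper for test data)."""
    return rng.standard_normal(shape) + 1j * rng.standard_normal(shape)

# Assumed (arbitrary) dimensions matching the shapes in the question.
a, b, c, d, l = 4, 5, 3, 2, 6
Y, K, D, M = crandn(a, l), crandn(a, b), crandn(b, d), crandn(b, l)
W, L, H, S = crandn(b, c), crandn(b, b), crandn(b, l), crandn(l, c)
A = crandn(d, l)

def residuals(A):
    DA = D @ A
    return K @ (DA * M) - Y, L @ (DA * H) @ S - W

def loss(A):
    r_beta, r_alpha = residuals(A)
    return 0.5 * np.sum(np.abs(r_beta)**2) + 0.5 * np.sum(np.abs(r_alpha)**2)

def wirtinger_grad(A):
    """d(loss)/dA with A and conj(A) treated as independent variables."""
    r_beta, r_alpha = residuals(A)
    g_beta = 0.5 * D.T @ (M * (K.T @ np.conj(r_beta)))
    g_alpha = 0.5 * D.T @ (H * (L.T @ np.conj(r_alpha) @ S.T))
    return g_alpha + g_beta

# Directional derivative along a random complex direction E.
E = crandn(d, l)
t = 1e-5
numeric = (loss(A + t * E) - loss(A - t * E)) / (2 * t)
analytic = 2 * np.real(np.sum(wirtinger_grad(A) * E))
rel_err = abs(numeric - analytic) / abs(analytic)
print(rel_err)
```

For a descent method, note that the direction of steepest ascent of a real-valued loss is the derivative with respect to $A^*$, which here is the complex conjugate of `wirtinger_grad(A)`.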