I want to calculate $$ \frac{\partial(A \circ X^\top X)}{\partial X}, $$ where $\circ$ is the Hadamard (elementwise) product, $X \in \mathbb{R}^{r \times n}$, $A \in \mathbb{R}^{n \times n}$, and $\frac{\partial A}{\partial X}=0$.
So far, I have found the product rule $$ \frac{\partial(A \circ B)}{\partial C} = \frac{\partial A}{\partial C} \circ B + A \circ \frac{\partial B}{\partial C},$$ which comes from the question "generic rule matrix differentiation (Hadamard Product, element-wise)".
In my case, $B=X^\top X$ and $C=X$.
Therefore, $$ \frac{\partial(A \circ X^\top X)}{\partial(X)} = A \circ 2X^\top .$$
However, $A\in \mathbb{R}^{n\times n}$ while $X^\top \in \mathbb{R}^{n \times r}$, so the dimensions do not match and this Hadamard product is undefined.
How should this derivative be computed?
First we'll need a few tensors.
The first is a 6th-order tensor ${\mathbb M}$ whose components ${\mathbb M}_{ijklmn}$ equal one if $(i=k=m)$ and $(j=l=n)$, and zero otherwise.
This tensor makes it possible to rewrite a Hadamard ($\circ$) product using Frobenius ($:$) products, i.e. double contractions: $$\eqalign{ A\circ Z &= A:{\mathbb M}:Z \cr }$$
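As a quick numerical sanity check of this identity, here is a minimal NumPy sketch. The size `n = 3`, the random test matrices, and the variable names are assumptions chosen for illustration; the last tensor index is called `p` in the code because `n` already denotes the matrix size.

```python
import numpy as np

n = 3
rng = np.random.default_rng(0)
A = rng.standard_normal((n, n))
Z = rng.standard_normal((n, n))

# M[i,j,k,l,m,p] = 1 when i == k == m and j == l == p, and 0 otherwise.
# No index is summed here because every index appears in the output.
d = np.eye(n)
M = np.einsum("ik,km,jl,lp->ijklmp", d, d, d, d)

# A : M : Z contracts the first index pair of M with A
# and the last index pair of M with Z, leaving the middle pair (k, l).
hadamard_via_M = np.einsum("ij,ijklmp,mp->kl", A, M, Z)

print(np.allclose(hadamard_via_M, A * Z))
```

The tensor ${\mathbb M}$ is extremely sparse (only $n^2$ of its $n^6$ entries are nonzero), so in practice one would use the elementwise product directly; the tensor form is mainly useful for the algebra.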
Next we'll need two 4th-order isotropic tensors whose components are $$\eqalign{ {\mathbb E}_{ijkl} &= \delta_{ik}\,\delta_{jl} \cr {\mathbb B}_{ijkl} &= \delta_{il}\,\delta_{jk} \cr }$$
These tensors make it possible to rearrange matrix products: $$\eqalign{ A\,dX\,Z &= A{\mathbb E}Z^T:dX \cr A\,dX^T\,Z &= A{\mathbb E}Z^T:{\mathbb B}:dX \cr }$$
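Both rearrangements can be checked numerically. The sketch below assumes the contraction convention that a matrix written to the left of ${\mathbb E}$ contracts with its first index and a matrix written to the right contracts with its last index. It also uses square matrices of a common size `s` so that a single ${\mathbb E}$ and ${\mathbb B}$ suffice; both choices are simplifications made here, not part of the derivation.

```python
import numpy as np

s = 3
rng = np.random.default_rng(1)
A, dX, Z = (rng.standard_normal((s, s)) for _ in range(3))

d = np.eye(s)
E = np.einsum("ik,jl->ijkl", d, d)  # E[i,j,k,l] = delta_ik * delta_jl
B = np.einsum("il,jk->ijkl", d, d)  # B[i,j,k,l] = delta_il * delta_jk

# A E Z^T: A contracts with the first index of E, Z^T with the last one.
AEZt = np.einsum("ia,ajkb,bl->ijkl", A, E, Z.T)

# First identity: A dX Z = (A E Z^T) : dX
lhs1 = A @ dX @ Z
rhs1 = np.einsum("ijkl,kl->ij", AEZt, dX)

# Second identity: A dX^T Z = (A E Z^T) : B : dX
lhs2 = A @ dX.T @ Z
rhs2 = np.einsum("ijkl,klmp,mp->ij", AEZt, B, dX)

print(np.allclose(lhs1, rhs1), np.allclose(lhs2, rhs2))
```

In this picture ${\mathbb B}:dX$ is just the transpose of $dX$, which is why it appears wherever $dX^T$ has to be turned back into $dX$.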
Now we are ready to find the differential and gradient of your function $$\eqalign{ F &= A\circ X^TX \cr &= A:{\mathbb M}:X^TX \cr\cr dF &= A:{\mathbb M}:(dX^TX+X^TdX) \cr &= A:{\mathbb M}:({\mathbb E}X^T:{\mathbb B}:dX+X^T{\mathbb E}:dX) \cr &= A:{\mathbb M}:{\mathbb E}X^T:{\mathbb B}:dX + A:{\mathbb M}:X^T{\mathbb E}:dX \cr &= A:{\mathbb M}:\big({\mathbb E}X^T:{\mathbb B}\,\,+\,\,X^T{\mathbb E}\big):dX \cr\cr \frac{\partial F}{\partial X} &= A:{\mathbb M}:\Big({\mathbb E}X^T:{\mathbb B} \,+\, X^T{\mathbb E}\Big) \cr }$$
Another approach is to use vectorization.
Let $$\eqalign{ f &= \operatorname{vec}(F) \cr x &= \operatorname{vec}(X) \cr a &= \operatorname{vec}(A) \cr {\mathcal A} &= \operatorname{Diag}(a) \cr }$$ Then $$\eqalign{ df &= a\circ\operatorname{vec}(dX^TX+X^TdX) \cr &= {\mathcal A}\,\Big((X^T\otimes I)B\,dx + (I\otimes X^T)\,dx \Big) \cr\cr \frac{\partial f}{\partial x} &= {\mathcal A}\,(X^T\otimes I)B\,+\,{\mathcal A}\,(I\otimes X^T) \cr }$$ where $I$ is the $n\times n$ identity and $B$ is the Kronecker commutation matrix, i.e. the permutation matrix for which $B\,\operatorname{vec}(dX) = \operatorname{vec}(dX^T)$.
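The vectorized Jacobian is easy to test against finite differences. The following is a minimal sketch, assuming column-stacking `vec`, small sizes `r = 2`, `n = 3`, and random test data; the helper names `vec`, `commutation_matrix`, and `f` are hypothetical names introduced only for this check.

```python
import numpy as np

r, n = 2, 3
rng = np.random.default_rng(2)
X = rng.standard_normal((r, n))
A = rng.standard_normal((n, n))

def vec(mat):
    # Column-stacking vectorization.
    return mat.reshape(-1, order="F")

def commutation_matrix(rows, cols):
    # Permutation matrix K with K @ vec(Y) == vec(Y.T) for Y of shape (rows, cols).
    K = np.zeros((rows * cols, rows * cols))
    for i in range(rows):
        for j in range(cols):
            K[i * cols + j, j * rows + i] = 1.0
    return K

I = np.eye(n)
K = commutation_matrix(r, n)   # the matrix called B in the text
calA = np.diag(vec(A))         # Diag(a)

# Jacobian from the formula above, shape (n*n, r*n).
J = calA @ (np.kron(X.T, I) @ K + np.kron(I, X.T))

def f(x):
    # f(x) = vec(A o X^T X), with x = vec(X).
    Xm = x.reshape((r, n), order="F")
    return vec(A * (Xm.T @ Xm))

# Central finite differences, one column per entry of x.
x0 = vec(X)
h = 1e-6
J_fd = np.column_stack(
    [(f(x0 + h * e) - f(x0 - h * e)) / (2 * h) for e in np.eye(r * n)]
)

print(np.allclose(J, J_fd, atol=1e-6))
```

Because $f$ is quadratic in $x$, central differences have no truncation error here, so any disagreement beyond roundoff would point to a mistake in the formula or in the `vec` ordering.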
This is actually quite similar to the tensor result, with $$\eqalign{ {\mathcal A} &\sim A:{\mathbb M} \cr (X^T\otimes I)B &\sim I\,{\mathbb E}\,X^T:{\mathbb B} \cr (I\otimes X^T) &\sim X^T{\mathbb E}\,I \cr }$$
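For a concrete picture, both results can be written out in components. This is a sketch with index names chosen here: $i,j$ label the entries of $F$, $k$ runs over the $r$ rows of $X$, and $l$ over its $n$ columns. Starting from $F_{ij} = A_{ij}\sum_p X_{pi}X_{pj}$ and differentiating directly gives

$$ \frac{\partial F_{ij}}{\partial X_{kl}} = A_{ij}\,\big(\delta_{il}\,X_{kj} + \delta_{jl}\,X_{ki}\big) $$

The first term corresponds to ${\mathbb E}X^T:{\mathbb B}$ and the second to $X^T{\mathbb E}$. The gradient is therefore an $n\times n\times r\times n$ array rather than a matrix, which is why it cannot be expressed as a Hadamard product of $A$ with an $n\times r$ matrix such as $2X^T$.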