A matrix $W \in \mathbb{R}^{d \times d}$ is the weighted adjacency matrix of a directed acyclic graph (DAG) if and only if
$$ h(W) = \operatorname{tr} \left( \exp(W \circ W ) \right) - d = 0 $$
where $\circ$ is the Hadamard product and $\exp(A)$ is the matrix exponential of $A$. Moreover, $h(W)$ has a simple gradient
$$\nabla h(W) = (e^{W \circ W})^T \circ 2W$$
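As a quick illustration of the characterization above, here is a minimal numerical sketch. It is not taken from the paper: the helper `expm_taylor` is a hypothetical truncated Taylor series standing in for a library matrix exponential, and the two example matrices are made up.

```python
import numpy as np

def expm_taylor(A, terms=30):
    """Matrix exponential via a truncated Taylor series.

    Hypothetical helper for illustration only; adequate for small
    matrices with modest entries, not a robust general-purpose routine.
    """
    result = np.eye(A.shape[0])
    term = np.eye(A.shape[0])
    for k in range(1, terms):
        term = term @ A / k          # term now holds A^k / k!
        result = result + term
    return result

def h(W):
    """h(W) = tr(exp(W ∘ W)) - d, with ∘ the Hadamard product."""
    return np.trace(expm_taylor(W * W)) - W.shape[0]

# Strictly upper-triangular weights: edges 1->2, 1->3, 2->3, so a DAG.
W_dag = np.array([[0.0, 1.5, -0.7],
                  [0.0, 0.0,  2.0],
                  [0.0, 0.0,  0.0]])

# Adding the edge 3->1 closes the cycle 1->2->3->1.
W_cyc = W_dag.copy()
W_cyc[2, 0] = 0.5

h_dag = h(W_dag)   # should vanish up to rounding
h_cyc = h(W_cyc)   # should be strictly positive
print(h_dag, h_cyc)
```

Squaring the entries makes $W \circ W$ non-negative, so every closed walk contributes a positive amount to the trace and negative edge weights cannot cancel a cycle.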
My proof
To obtain the gradient of the function $h(W)$ with respect to the matrix $W$, we can apply standard matrix calculus techniques.
Take the derivative of $h(W)$ with respect to each element of the matrix $W$. When we compute the derivative of a function with respect to a matrix, such as $\frac{\partial h(W)}{\partial W}$, we need to take into account that each element of the matrix $W$ is a separate variable. In the context of the gradient, we want to find the derivative with respect to the entire matrix $W$, resulting in a matrix of the same size as $W$ representing the gradient. The entry in the gradient matrix corresponding to the element $W_{ij}$ is the partial derivative $\frac{\partial h(W)}{\partial W_{ij}}$.
To compute the derivative of the trace term $\text{tr}(e^{W \circ W})$, I found this reference useful: \url{http://paulklein.ca/newsite/teaching/matrix%20calculus.pdf}
Applying the chain rule, $\frac{\partial \text{tr}(f(W))}{\partial W} = \frac{\partial \text{tr}(f(W))}{\partial f(W)} \frac{\partial f(W)}{\partial W}$, where $f(W)$ is a matrix function of $W$. In this case, $f(W) = e^{W \circ W}$.
Using equation (9) from the above reference, we see that $\frac{\partial \text{tr}(f(W))}{\partial f(W)} = I$, where $I$ denotes the identity matrix.
We are left to compute $\frac{\partial f(W)}{\partial W}$.
Let $M = W \circ W$. Express $f(W)$ in terms of $M$: $f(W) = e^M$.
Differentiate $f(W)$ with respect to $M$: $\frac{\partial f(W)}{\partial M} = \frac{\partial e^M}{\partial M}$.
Apply the derivative of the matrix exponential function: $\frac{\partial e^M}{\partial M} = e^M$.
Replace $M$ with $W \circ W$ in the expression: $\frac{\partial f(W)}{\partial M} = e^{W \circ W}$.
Finally, we have $\frac{\partial f(W)}{\partial W} = \frac{\partial f(W)}{\partial M} \cdot \frac{\partial (W \circ W)}{\partial W}$.
Compute the derivative of $(W \circ W)$ with respect to $W$. Since the Hadamard product is element-wise, we have $\frac{\partial (W \circ W)}{\partial W} = 2W$.
Multiply $\frac{\partial f(W)}{\partial M}$ with $\frac{\partial (W \circ W)}{\partial W}$: $\frac{\partial f(W)}{\partial W} = e^{W \circ W} \cdot 2W$.
But in the original paper, the result is $(e^{W \circ W})^T \circ 2W$. Which step did I get wrong?
$$\eqalign{ X &= W\odot W \\ dX &=2W\odot dW \qquad\quad \{{\rm differential}\} \\ E &= e^X \;\,\equiv\;\, I + \sum_{k=1}^\infty \frac1{k!} X^k\\ dE &= \sum_{k=1}^\infty \frac1{k!} \sum_{j=1}^k X^{k-j}\;dX\;X^{j-1} \\ \\ h &= \operatorname{trace}(E) \\ &= I:E \qquad\;\; \{{\rm matrix\;inner\;product}\} \\ dh &= I:dE \\ &= \left(\sum_{k=1}^\infty \frac1{k!} \sum_{j=1}^k X^{k-j}\;I\;X^{j-1}\right)^T:dX \\ &= \left(\sum_{k=1}^\infty \frac1{(k-1)!}X^{k-1}\right)^T:dX \\ &= E^T:dX \\ &= E^T:(2W\odot dW) \\ &= (2E^T\odot W):dW \\ \frac{\partial h}{\partial W} &= 2E^T\odot W \\ }$$
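The result of this derivation can be sanity-checked numerically. The sketch below compares $2E^T\odot W$ and the matrix-product expression $e^{W\circ W}\cdot 2W$ from the question against a central finite-difference gradient. The helper names (`expm_taylor`, `grad_hadamard`, `grad_matmul`, `grad_fd`) and the random test matrix are assumptions made for illustration only.

```python
import numpy as np

def expm_taylor(A, terms=40):
    """Truncated Taylor series for the matrix exponential (illustrative helper)."""
    result = np.eye(A.shape[0])
    term = np.eye(A.shape[0])
    for k in range(1, terms):
        term = term @ A / k          # A^k / k!
        result = result + term
    return result

def h(W):
    """h(W) = tr(exp(W ∘ W)) - d."""
    return np.trace(expm_taylor(W * W)) - W.shape[0]

def grad_hadamard(W):
    """Candidate gradient 2 E^T ∘ W, with E = exp(W ∘ W)."""
    return 2 * expm_taylor(W * W).T * W

def grad_matmul(W):
    """Candidate from the question: matrix product exp(W ∘ W) · 2W."""
    return expm_taylor(W * W) @ (2 * W)

def grad_fd(W, eps=1e-6):
    """Central finite differences, one entry of W at a time."""
    G = np.zeros_like(W)
    for i in range(W.shape[0]):
        for j in range(W.shape[1]):
            P = np.zeros_like(W)
            P[i, j] = eps
            G[i, j] = (h(W + P) - h(W - P)) / (2 * eps)
    return G

rng = np.random.default_rng(0)
W = 0.5 * rng.standard_normal((4, 4))   # arbitrary dense test matrix

G_fd = grad_fd(W)
err_hadamard = np.max(np.abs(grad_hadamard(W) - G_fd))
err_matmul = np.max(np.abs(grad_matmul(W) - G_fd))
print(err_hadamard, err_matmul)
```

On a generic matrix the Hadamard form should agree with the finite-difference gradient to within discretization error, while the matrix-product form should not. The step that differs is $dX = 2W\odot dW$: the element-wise product is linear in $dW$ entry by entry, so it cannot be pulled out as a matrix factor $2W$.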
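Element-wise exponential. If $e^X$ is instead read as the element-wise exponential, only the diagonal of $E$ survives the inner product with $I$: $$\eqalign{ E_{ij} &= e^{X_{ij}} \\ dE_{ij} &= E_{ij}\;dX_{ij} \\ dE &= E\odot dX \\ dh &= I:dE \\ &= I:(E\odot dX) \\ &= (I\odot E):dX \\ &= (I\odot E):(2W\odot dW) \\ &= (2I\odot E\odot W):dW \\ \frac{\partial h}{\partial W} &= 2I\odot E\odot W \\ }$$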