I found a paper at https://arxiv.org/abs/2302.13163 that proposes a modification of natural gradient descent. The natural gradient is computed as (eq. 8)
$$ \nabla^E L(\theta) := G_E^{-1}(\theta)\,\nabla L(\theta), $$
where $\theta$ are the parameters of a neural network and $L$ is the loss function. The "energy Gram matrix" is defined by (eq. 6)
$$ G_E(\theta)_{ij} := D^2 E(u_\theta)\bigl(\partial_{\theta_i}u_\theta,\, \partial_{\theta_j}u_\theta\bigr). $$
If I understood the paper correctly, $D^2$ involves weak derivatives up to order 2, $D^2 E(u_\theta)$ is a bilinear operator, and $E(u)$ is the energy functional. In one example it is defined as (eq. 16)
$$ E(u) := \frac{1}{2}\int_{\Omega} \left|u'\right|^{2}\,\mathrm{d}x + \frac{1}{4}\int_{\Omega} u^{4}\,\mathrm{d}x - \int_{\Omega} f u\,\mathrm{d}x, $$
and the corresponding bilinear form used in the energy Gram matrix is given as (one equation after eq. 17)
$$ D^{2} E(u)(v, w) = \int_{\Omega} v' w'\,\mathrm{d}x + 3\int_{\Omega} u^{2} v w\,\mathrm{d}x. $$
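To make sure I am reading eqs. 6 and 8 correctly, here is a small numerical sketch of my understanding (NumPy on a 1D grid; the function `u`, the tangent functions in `J`, and the loss gradient `grad_L` are made-up placeholders, not values from the paper). It evaluates the bilinear form from the example by trapezoidal quadrature to assemble $G_E(\theta)$, then applies eq. 8 by solving a linear system:

```python
import numpy as np

def trapz(f, dx):
    """Trapezoidal rule for samples f on a uniform grid with spacing dx."""
    return dx * (f.sum() - 0.5 * (f[0] + f[-1]))

def energy_gram_matrix(u, J, x):
    """Assemble G_E(theta)_ij = D^2 E(u)(d_i u, d_j u) per eq. 6, using the
    example's bilinear form D^2 E(u)(v, w) = int v'w' dx + 3 int u^2 v w dx.

    u: (n,) values of u_theta on the grid x
    J: (p, n) rows are the tangents d_i u = partial u_theta / partial theta_i
    """
    dx = x[1] - x[0]
    Jp = np.gradient(J, dx, axis=1)  # spatial derivatives v_i' (finite differences)
    p = J.shape[0]
    G = np.empty((p, p))
    for i in range(p):
        for j in range(p):
            term1 = trapz(Jp[i] * Jp[j], dx)            # int v' w' dx
            term2 = 3.0 * trapz(u**2 * J[i] * J[j], dx)  # 3 int u^2 v w dx
            G[i, j] = term1 + term2
    return G

# Toy setup on Omega = [0, 1] (placeholders for illustration only)
x = np.linspace(0.0, 1.0, 201)
u = np.sin(np.pi * x)                            # stand-in for u_theta
J = np.stack([np.sin(np.pi * x), x * (1.0 - x)])  # stand-in tangents, p = 2
grad_L = np.array([0.3, -0.1])                   # stand-in for grad L(theta)

G = energy_gram_matrix(u, J, x)
# Eq. 8: nabla^E L(theta) = G_E(theta)^{-1} grad L(theta)
nat_grad = np.linalg.solve(G, grad_L)
```

With linearly independent tangents, `G` comes out symmetric and positive definite here, so the solve is well posed. Is this how the matrix is meant to be assembled?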
What I don't understand is how eq. 6, applied to this energy, yields that last equation. Thank you for your time.