Consider a 2D SDE for the process $Z_t=(X_t,Y_t)^T$:
$$dZ_t=f(X_t,Y_t)\,dt+g(X_t,Y_t)\,dB_t$$
My textbook states that the diffusivity is found by calculating $D=\tfrac{1}{2}\, g(X_t,Y_t)\,g(X_t,Y_t)^T$. For this case, that would result in a $2\times 2$ matrix (four entries). When later finding the advective term $u = f(X_t,Y_t)-\nabla D$, what is the significance of taking the gradient of that matrix? Or is that not how you would calculate the diffusivity for a multivariate SDE? It would make sense to me if the first factor in the product were transposed instead of the second, since that would give a scalar.
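
To make the shapes concrete, here is a minimal Python sketch of what I think the calculation is. The function `g` is a made-up $2\times2$ example, not one from the textbook. I am also assuming that $\nabla D$ is meant as the row-wise divergence, $(\nabla\cdot D)_i=\sum_j \partial D_{ij}/\partial x_j$. That reading is my guess, not something the book states in this form. Under these assumptions $D$ comes out as a $2\times2$ matrix and the correction term as a 2-vector, so it could at least be subtracted from $f$.

```python
def g(x, y):
    # Hypothetical 2x2 noise intensity, chosen only for illustration.
    return [[x, 0.0],
            [y, 1.0]]


def diffusivity(x, y):
    # D = 1/2 * g g^T, written out with plain lists.
    G = g(x, y)
    n, m = len(G), len(G[0])
    return [[0.5 * sum(G[i][k] * G[j][k] for k in range(m))
             for j in range(n)]
            for i in range(n)]


def div_D(x, y, h=1e-5):
    # Assumed meaning of "nabla D": row-wise divergence,
    # (div D)_i = dD_i1/dx + dD_i2/dy, via central differences.
    Dxp, Dxm = diffusivity(x + h, y), diffusivity(x - h, y)
    Dyp, Dym = diffusivity(x, y + h), diffusivity(x, y - h)
    return [(Dxp[i][0] - Dxm[i][0]) / (2 * h)
            + (Dyp[i][1] - Dym[i][1]) / (2 * h)
            for i in range(2)]


if __name__ == "__main__":
    print(diffusivity(2.0, 3.0))  # a 2x2 nested list
    print(div_D(2.0, 3.0))        # a list with 2 entries
```

For this particular `g`, $D=\tfrac12\begin{pmatrix}x^2 & xy\\ xy & y^2+1\end{pmatrix}$, so by hand the row-wise divergence is $(\tfrac32 x,\ \tfrac32 y)^T$, which is what the finite differences should approximate.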