I answered this question, but I'd like to understand more details about the matrix notation behind it (and that's why I'm making another post). We have $f:\Bbb R^n\to \Bbb R$ given by $$f(\theta) \doteq \alpha e^{-\beta \theta^\top\theta}, $$alright. We want to compute the bilinear map ${\rm Hess} f (\theta)$. Since I recognize $g (\theta)\doteq\theta^\top \theta$ as $\langle \theta,\theta\rangle$ (of course, $\langle \cdot,\cdot\rangle$ denotes the usual scalar product), I see that $$Dg(\theta) = 2\langle \theta, \cdot \rangle = 2\theta^\top, $$and hence $\nabla g (\theta) = 2\theta $. Then the chain rule gives $$\nabla f (\theta) = -2\alpha \beta e^{-\beta \theta^\top \theta}\theta $$as the OP of the linked question states. So far so good.
I'm having trouble doing something similar to check that $${\rm Hess}f (\theta)=2\alpha \beta e^{-\beta \theta^\top\theta}(2\beta \color{blue}{\theta\theta^\top}-{\rm Id}_n).$$I do not want to use components as I did there.
A simple attempt is to use the product rule together with ${\rm d}\theta ={\rm Id}_n $. Differentiating the expression for $\nabla f (\theta) $ we get $$-2\alpha\beta (e^{-\beta\theta^\top\theta}(-2\beta \theta^\top)\theta +e^{-\beta \theta^\top\theta}{\rm Id}_n) = 2\alpha \beta e^{-\beta \theta^\top\theta}(2\beta\color{red}{\theta^\top\theta}-{\rm Id}_n), $$but this doesn't type-check ($\theta^\top\theta$ is a scalar, while ${\rm Id}_n$ is an $n\times n$ matrix), and I can't see why the order comes out wrong.
So I'd like to know exactly which identification I am missing here. I also recognize $\theta\theta^\top$ as the matrix of the bilinear map $\theta \otimes \theta$, and I'm comfortable with tensor products, so you can come in with guns blazing, if needed.
Thanks.
Although you've already used $g$, I'd like to use it to denote the gradient, i.e. $\,\,g=\nabla f$.
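Find the differential of the gradient, then the Hessian, using $df = g^Td\theta$ (here $f$ is a scalar and $g$, $\theta$ are column vectors): $$\eqalign{ g &= -2\beta f\theta \cr dg &= -2\beta(\theta\,df+f\,d\theta) \cr &= -2\beta(\theta g^Td\theta+fI\,d\theta) \cr H=\frac{\partial g}{\partial\theta} &= -2\beta\,(\theta g^T+fI) \cr &= 2\beta\,\,\big(\theta(2\beta f\theta)^T-fI\big) \cr &= 2f\beta\,\,(2\beta\,\theta\theta^T-I) \cr }$$ The order is forced by the product rule: the scalar $df = g^Td\theta$ has to stay to the right of the vector $\theta$ so that $d\theta$ can be factored out on the right, giving $\theta\,df = (\theta g^T)\,d\theta$. Writing the derivative of the exponential as the row vector $-2\beta\theta^\top$ to the left of $\theta$, as in your attempt, contracts the two copies of $\theta$ into a scalar instead. As expected, this is your result but with the change $$\theta^T\theta \to \theta\theta^T$$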
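If you want a sanity check that the outer product $\theta\theta^T$ (and not the scalar $\theta^T\theta$) is the right object, here is a minimal numerical sketch in Python with NumPy. It compares the closed-form Hessian above against central finite differences of the gradient. The function names (`grad_f`, `hess_f`, `numerical_hessian`), the step size `h`, and the test values of $\alpha$, $\beta$, $\theta$ are my own choices for illustration, not anything from the linked question.

```python
import numpy as np

def grad_f(theta, alpha, beta):
    """Gradient from the post: -2*alpha*beta*exp(-beta*theta^T theta)*theta."""
    return -2.0 * alpha * beta * np.exp(-beta * (theta @ theta)) * theta

def hess_f(theta, alpha, beta):
    """Closed-form Hessian: 2*f*beta*(2*beta*theta theta^T - I)."""
    f = alpha * np.exp(-beta * (theta @ theta))
    n = theta.size
    # np.outer builds the n x n matrix theta theta^T (not the scalar theta^T theta)
    return 2.0 * f * beta * (2.0 * beta * np.outer(theta, theta) - np.eye(n))

def numerical_hessian(theta, alpha, beta, h=1e-6):
    """Central finite differences of the gradient, one column per coordinate."""
    n = theta.size
    H = np.empty((n, n))
    for j in range(n):
        e = np.zeros(n)
        e[j] = h
        H[:, j] = (grad_f(theta + e, alpha, beta)
                   - grad_f(theta - e, alpha, beta)) / (2.0 * h)
    return H

# Arbitrary test point and parameters (assumed values, chosen only for the check)
rng = np.random.default_rng(0)
theta = rng.normal(size=4)
alpha, beta = 1.5, 0.3

H_closed = hess_f(theta, alpha, beta)
H_fd = numerical_hessian(theta, alpha, beta)
max_err = np.max(np.abs(H_closed - H_fd))
print("max abs difference:", max_err)
```

The two matrices should agree up to finite-difference error, and the closed form is visibly symmetric, as a Hessian must be. Replacing `np.outer(theta, theta)` with `theta @ theta` makes the check fail whenever $\theta \neq 0$: NumPy broadcasts the scalar over every entry of the identity, which is exactly the mismatch in the red expression.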