The following notation is used:
$x$: vector-valued random variable with mean $\bar{x} = 0$
$P$: covariance of $x$
$tr(\cdot)$: trace operator
$E[\cdot]$: expectation
$M_x(s)$: Moment generating function of $x$
$\nabla_s$: gradient with respect to $s$
This is the statement from my reading:
"~ for a Gaussian random vector $x$, the identity $E[x^{\top}Ax] = tr(AP)$ can be derived using the moment generating function $M_x(s) = E[e^{s^{\top}x}] = e^{\frac{1}{2}s^{\top}Ps + s^{\top}\bar{x}}$ and the gradient operator $\nabla_s$".
I follow the partially shown derivation: \begin{align} E[x^{\top}Ax] &= E[(\nabla_s e^{s^{\top}x})^{\top}Ax]|_{s = 0} \qquad \text{(since $x = \nabla_s e^{s^{\top}x}|_{s=0}$)} \\ &= E[\nabla^{\top}_sAx e^{s^{\top}x}]|_{s=0} \quad\qquad \text{(since $e^{s^{\top}x} $ is a scalar)} \\ &= \nabla^{\top}_sAE[xe^{s^{\top}x}]|_{s=0} \quad\qquad \text{(A is constant and $E[\cdot]$ and $\nabla_s$ are linear)} \\ &= \nabla^{\top}_sAE[\nabla_s e^{s^{\top}x}]|_{s=0} \quad\qquad \text{($xe^{s^{\top}x} = \nabla_s e^{s^{\top}x}$)} \\ &= \nabla^{\top}_sA\nabla_sE[ e^{s^{\top}x}]|_{s=0} \quad\qquad \text{($E[\cdot]$ and $\nabla_s$ linear)} \\ &= \nabla^{\top}_sA\nabla_sM_x(s)|_{s=0} \quad\qquad \text{(definition of $M_x(s)$)} \\ \end{align}
But when I evaluate the gradients in the last line, the trace operator goes missing somewhere in the steps below:
\begin{align} E[x^{\top}Ax] &= \nabla^{\top}_sA\nabla_sM_x(s)|_{s=0} \quad\qquad \\ &= \nabla^{\top}_s A \nabla_s \left(e^{\frac{1}{2}s^{\top}Ps} \right)|_{s=0}\quad\qquad \qquad\text{(since $\bar{x} = 0$ )} \\ &= \nabla^{\top}_s A(Ps)e^{\frac{1}{2}s^{\top}Ps}|_{s=0}\qquad\qquad\qquad \text{(chain rule and $P$ symmetric)} \\ &=?\quad A\left[Pe^{\frac{1}{2}s^{\top}Ps} + (Ps)(Ps)^{\top} e^{\frac{1}{2}s^{\top}Ps}\right]|_{s=0}\quad\qquad \text{(product rule)} \\ &=?\quad AP \qquad\qquad\qquad\qquad\qquad\qquad\qquad \text{(evaluating at $s = 0$)} \\ \end{align}
I know this result is wrong, but I cannot see where the error is. I suspect the second-to-last line. If so, what is the mistake, and where should the trace operator appear?
Based on the answer, a complete derivation would be:
\begin{align} E[x^{\top}Ax] &= \nabla^{\top}_sA\nabla_sM_x(s)|_{s=0} \quad\qquad \\ &= \nabla^{\top}_s A \nabla_s \left(e^{\frac{1}{2}s^{\top}Ps} \right)|_{s=0}\quad\qquad \qquad\text{(since $\bar{x} = 0$)} \\ &= \nabla^{\top}_s A(Ps)e^{\frac{1}{2}s^{\top}Ps}|_{s=0}\qquad\qquad\qquad \text{(chain rule and $P$ symmetric)} \\ &= \left[tr(AP)e^{\frac{1}{2}s^{\top}Ps} + (Ps)^{\top}(APs)e^{\frac{1}{2}s^{\top}Ps}\right]|_{s=0}\quad\qquad \text{(product rule, with $\nabla^{\top}_s(APs) = tr(AP)$)} \\ &= tr(AP) \qquad\qquad\qquad\qquad\qquad\qquad\qquad \text{(evaluating at $s = 0$)} \end{align}
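A quick sanity check, assuming the scalar case $x \sim N(0, p)$ with $A = a$ (the symbols $p$ and $a$ are introduced here only for illustration): then $M_x(s) = e^{\frac{1}{2}ps^2}$, and the same recipe gives
\begin{align} \frac{d}{ds}\, a\, \frac{d}{ds} e^{\frac{1}{2}ps^2}\Big|_{s=0} &= a\,\frac{d}{ds}\left(ps\, e^{\frac{1}{2}ps^2}\right)\Big|_{s=0} \\ &= a\left(p + p^2s^2\right)e^{\frac{1}{2}ps^2}\Big|_{s=0} \\ &= ap, \end{align}
which agrees with $E[ax^2] = ap = tr(AP)$ for a $1 \times 1$ covariance.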
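Let $B$ be a constant square matrix. Since $\nabla_s^\top$ applied to a vector-valued function is a divergence (a scalar), we have \begin{equation} \nabla_s^\top B s = \sum_i\frac{\partial}{\partial s_i}\sum_j b_{i j} s_j =\sum_i\sum_j b_{i j}\frac{\partial s_j}{\partial s_i} = \sum_i\sum_j b_{i j}\delta_i^j=\text{tr}(B) \end{equation} where $\delta_i^j$ is the Kronecker delta.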
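To connect this to the product-rule step in the question, here is a sketch in the same index notation, taking $B = AP$ and writing $f(s) = e^{\frac{1}{2}s^{\top}Ps}$ (the shorthand $f$ is introduced only for this sketch, and $\nabla_s f = (Ps)f$ uses the symmetry of $P$):
\begin{align} \nabla_s^\top \left(Bs\, f(s)\right) &= \sum_i \frac{\partial}{\partial s_i}\left(\sum_j b_{ij}s_j\, f(s)\right) \\ &= \sum_i\sum_j b_{ij}\delta_i^j\, f(s) + \sum_i\sum_j b_{ij}s_j \frac{\partial f}{\partial s_i} \\ &= \text{tr}(B)\, f(s) + (\nabla_s f)^\top Bs \\ &= \text{tr}(AP)\, e^{\frac{1}{2}s^{\top}Ps} + (Ps)^\top (APs)\, e^{\frac{1}{2}s^{\top}Ps}. \end{align}
At $s = 0$ the second term vanishes and $f(0) = 1$, leaving $\text{tr}(AP)$. The attempt in the question instead treated $\nabla_s^\top(APs)$ as the matrix $AP$, which is where the trace was lost.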