Expectation of quartic form for multivariate Gaussian

I am reading the 2012 version of The Matrix Cookbook. On page 43, Section 8.2.4, "Mean of Quartic Forms", there is a formula that really confuses me:

$$E[x^TAxx^TBx]=Tr[A\Sigma(B+B^T)\Sigma]+m^T(A+A^T)\Sigma(B+B^T)m+(Tr(A\Sigma)+m^TAm)(Tr(B\Sigma)+m^TBm)$$
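Before attempting a proof, the formula can be sanity-checked numerically without any simulation: for $x\sim N(m,\Sigma)$ the fourth moment $E[x_ix_jx_kx_l]$ has a closed form (expand $x=m+z$ with $z\sim N(0,\Sigma)$ and apply Isserlis' theorem to $z$). The following sketch, with arbitrary test matrices of my own choosing, compares the brute-force moment sum against the Cookbook's right-hand side:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 3
A = rng.standard_normal((n, n))      # general (non-symmetric) A
B = rng.standard_normal((n, n))      # general (non-symmetric) B
m = rng.standard_normal(n)
S = rng.standard_normal((n, n))
Sigma = S @ S.T + n * np.eye(n)      # a positive-definite covariance

def fourth_moment(i, j, k, l):
    """Exact E[x_i x_j x_k x_l] for x ~ N(m, Sigma):
    Isserlis pairings of Sigma, mixed Sigma * m * m terms, and the pure mean term."""
    Sg, mu = Sigma, m
    return (Sg[i, j]*Sg[k, l] + Sg[i, k]*Sg[j, l] + Sg[i, l]*Sg[j, k]
            + mu[i]*mu[j]*Sg[k, l] + mu[i]*mu[k]*Sg[j, l] + mu[i]*mu[l]*Sg[j, k]
            + mu[j]*mu[k]*Sg[i, l] + mu[j]*mu[l]*Sg[i, k] + mu[k]*mu[l]*Sg[i, j]
            + mu[i]*mu[j]*mu[k]*mu[l])

# E[x^T A x x^T B x] = sum_{ijkl} A_ij B_kl E[x_i x_j x_k x_l]
lhs = sum(A[i, j] * B[k, l] * fourth_moment(i, j, k, l)
          for i in range(n) for j in range(n)
          for k in range(n) for l in range(n))

# The Cookbook's right-hand side
rhs = (np.trace(A @ Sigma @ (B + B.T) @ Sigma)
       + m @ (A + A.T) @ Sigma @ (B + B.T) @ m
       + (np.trace(A @ Sigma) + m @ A @ m) * (np.trace(B @ Sigma) + m @ B @ m))
# lhs and rhs agree to machine precision
```

This only confirms the identity numerically, of course; it does not replace a proof.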

I have not checked the formulae for $E[xx^Txx^T]$, $E[xx^TAxx^T]$, or $E[x^Txx^Tx]$ in the Cookbook, since they are special cases of this one. Note that $x\sim N(m,\Sigma)$.

My question is: how can one prove it?


The following is my attempt:

Consider first the simpler situation, with $A$ and $B$ symmetric. I use the same technique that appears on page 11 of Seber and Lee's book "Linear Regression Analysis" to reorganize these two quadratic forms; then

$E[x^TAxx^TBx]=E\Big\{[(x-m)^TA(x-m)+2m^TA(x-m)+m^TAm]\cdot[(x-m)^TB(x-m)+2m^TB(x-m)+m^TBm]\Big\}$

$=E\Big\{(x-m)^TA(x-m)(x-m)^TB(x-m)\Big\}+E\Big\{(x-m)^TA(x-m)m^TBm\Big\}+E\Big\{m^TAm(x-m)^TB(x-m)\Big\}+E\Big\{m^TAmm^TBm\Big\}+2E\Big\{m^TA(x-m)m^TBm\Big\}+2E\Big\{m^TAmm^TB(x-m)\Big\}+2E\Big\{(x-m)^TA(x-m)m^TB(x-m)\Big\}+2E\Big\{m^TA(x-m)(x-m)^TB(x-m)\Big\}+4E\Big\{m^TA(x-m)m^TB(x-m)\Big\}$

in which the fifth and sixth terms are zero, and the first four terms can be rearranged to equal $(Tr(A\Sigma)+m^TAm)(Tr(B\Sigma)+m^TBm)$ using the fact that $E\Big\{(x-m)^TQ(x-m)\Big\}=Tr(Q\Sigma)$.

The last term can be reduced once we notice that $(\text{scalar})^T=\text{scalar}$:

$E\Big\{m^TA(x-m)\cdot m^TB(x-m)\Big\}=E\Big\{m^TA(x-m)\cdot (x-m)^TBm\Big\}=m^TAE\Big\{(x-m)\cdot (x-m)^T\Big\}Bm=m^TA\Sigma Bm$

Consequently (and this is my detailed question: how to show it?), it suffices to show that the seventh and eighth terms are equal to the first term on the RHS of the initial formula, that is, $2E\Big\{(x-m)^TA(x-m)m^TB(x-m)\Big\}+2E\Big\{m^TA(x-m)(x-m)^TB(x-m)\Big\}=Tr[A\Sigma(B+B^T)\Sigma]=2Tr(A\Sigma B\Sigma)$

I searched this website, and this link is useful: Expected value using the Kronecker Product

The RHS of my equation above can be rewritten, using the formula $Tr(ABCD)=vec(B^T)^T(A^T\otimes C)vec(D)$, as:

$Tr(A\Sigma B\Sigma)=vec(\Sigma)^T(A\otimes B)vec(\Sigma)$
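For symmetric $A$, $B$ and $\Sigma$ this vec/Kronecker form can be confirmed directly; a quick sketch with arbitrary symmetric test matrices, where `vec` is column-stacking (Fortran order), which matches `np.kron`'s convention:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 3
A = rng.standard_normal((n, n)); A = A + A.T   # symmetric test matrices
B = rng.standard_normal((n, n)); B = B + B.T
S = rng.standard_normal((n, n))
Sigma = S @ S.T + np.eye(n)                    # symmetric positive definite

vecS = Sigma.flatten(order='F')                # vec() stacks columns
lhs = np.trace(A @ Sigma @ B @ Sigma)
rhs = vecS @ np.kron(A, B) @ vecS
# lhs and rhs agree to machine precision
```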

The LHS, following the link, using the fact that $x\otimes y=vec(yx^T)$ for two column vectors $x$ and $y$, and using $scalar=Tr(scalar)$, is equivalent to:

$E\Big\{(x-m)^TA(x-m)\cdot m^TB(x-m)\Big\}+E\Big\{m^TA(x-m)\cdot (x-m)^TB(x-m)\Big\}=E\Big\{\Big((x-m)^TA(x-m)\Big)\otimes\Big(m^TB(x-m)\Big)\Big\}+E\Big\{\Big(m^TA(x-m)\Big)\otimes\Big((x-m)^TB(x-m)\Big)\Big\}=E\Big\{\Big((x-m)^T\otimes m^T\Big)\Big(A\otimes B\Big)\Big((x-m)\otimes(x-m)\Big)\Big\}+E\Big\{\Big(m^T\otimes (x-m)^T\Big)\Big(A\otimes B\Big)\Big((x-m)\otimes(x-m)\Big)\Big\}=E\Big\{vec\Big((x-m)m^T+m(x-m)^T\Big)^T\Big(A\otimes B\Big)vec\Big((x-m)(x-m)^T\Big)\Big\}$

Under the normality assumption, $vec(E[(x-m)(x-m)^T])=vec(\Sigma)$. Therefore this seems very close to the final identity (it ought to be that):

$vec(\Sigma)^T(A\otimes B)vec(\Sigma)=E\Big\{vec\Big((x-m)m^T+m(x-m)^T\Big)^T\Big(A\otimes B\Big)vec\Big((x-m)(x-m)^T\Big)\Big\}$

Can anyone help me prove the above equation? Or are there mistakes in the derivations above?

=============================

Here is my second attempt:

Inspired by Seber and Lee's book again (the same page 11): can anyone help me expand $E[x^TAxx^TBx]$ in coordinate form to obtain the same result? Though I expect the amount of work to be huge.

A natural question, based on my first attempt, is how to expand in coordinates

$2E\Big\{(x-m)^TA(x-m)m^TB(x-m)\Big\}+2E\Big\{m^TA(x-m)(x-m)^TB(x-m)\Big\}=Tr[A\Sigma(B+B^T)\Sigma]=2Tr(A\Sigma B\Sigma)$

from one side and then simplify to the other? It is certainly easier than expanding $E[x^TAxx^TBx]$, but how does one expand $Tr(A\Sigma B\Sigma)$? There are four matrices here rather than vectors.
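On the coordinate expansion of $Tr(A\Sigma B\Sigma)$: writing out the trace of the four-matrix product gives $Tr(A\Sigma B\Sigma)=\sum_{i,j,k,l}A_{ij}\Sigma_{jk}B_{kl}\Sigma_{li}$, which is easy to verify (an illustrative sketch with arbitrary test matrices; the identity needs no symmetry):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 3
A = rng.standard_normal((n, n))
B = rng.standard_normal((n, n))
Sigma = rng.standard_normal((n, n))   # symmetry is not needed for this identity

# Tr(A Sigma B Sigma) = sum_{i,j,k,l} A_ij Sigma_jk B_kl Sigma_li
coord = sum(A[i, j] * Sigma[j, k] * B[k, l] * Sigma[l, i]
            for i in range(n) for j in range(n)
            for k in range(n) for l in range(n))
direct = np.trace(A @ Sigma @ B @ Sigma)
# coord and direct agree to machine precision
```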

==============================

My last question: are there any books that not only collect matrix formulae but also provide proofs and solutions? I would appreciate recommendations!

Thank you in advance!

==============================

Answer (accepted):

Suppose that $A, B$ are real matrices (not necessarily symmetric).

We may express $x = m + \Sigma^{1/2}y$ where $y\sim \mathcal{N}(0, I)$.

For convenience, denote $C = \Sigma^{1/2}$. Note that $C$ is real symmetric.

Using $4ab = (a + b)^2 - (a - b)^2$, we have \begin{align*} \mathbb{E}[x^\mathsf{T}Ax x^\mathsf{T}Bx] &= \frac{1}{4}\,\mathbb{E}[(x^\mathsf{T}(A + B)x)^2] - \frac{1}{4}\,\mathbb{E}[(x^\mathsf{T}(A - B)x)^2]\\ &= \frac{1}{4}\,\mathbb{E}[(x^\mathsf{T}Px)^2] - \frac{1}{4}\,\mathbb{E}[(x^\mathsf{T}Qx)^2] \end{align*} where $P = \frac{A + A^\mathsf{T}}{2} + \frac{B + B^\mathsf{T}}{2}$ and $Q = \frac{A + A^\mathsf{T}}{2} - \frac{B + B^\mathsf{T}}{2}$, and we have used $x^\mathsf{T}Ax = x^\mathsf{T}\frac{A + A^\mathsf{T}}{2}x$.
Note: $P, Q$ are both symmetric.
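This polarization step is a pointwise algebraic identity, valid for every fixed vector $x$, which a short check confirms (a sketch with arbitrary non-symmetric test matrices of my own choosing):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 3
A = rng.standard_normal((n, n))   # not necessarily symmetric
B = rng.standard_normal((n, n))
x = rng.standard_normal(n)

# Symmetrized combinations, as in the answer
P = (A + A.T) / 2 + (B + B.T) / 2
Q = (A + A.T) / 2 - (B + B.T) / 2

lhs = (x @ A @ x) * (x @ B @ x)
rhs = 0.25 * (x @ P @ x) ** 2 - 0.25 * (x @ Q @ x) ** 2
# lhs and rhs agree to machine precision for every x
```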

Fact 1: Let $y \sim \mathcal{N}(0, I)$. Let $S$ be a given real symmetric matrix. Then:
(1) $\mathbb{E}[(y^\mathsf{T}Sy)^2] = 2\mathrm{tr}(S^2) + [\mathrm{tr}(S)]^2$;
(2) $\mathbb{E}[(y^\mathsf{T}Sy)y] = 0$.
The proof is not difficult. I gave a proof of (1) before. See: Expectation of product of quadratic forms in Gaussian distribution
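Fact 1(1) can also be checked by brute force from the Gaussian fourth-moment (Isserlis) rule $\mathbb{E}[y_iy_jy_ky_l]=\delta_{ij}\delta_{kl}+\delta_{ik}\delta_{jl}+\delta_{il}\delta_{jk}$ for $y\sim\mathcal{N}(0,I)$; Fact 1(2) holds because every third moment of a centered Gaussian vanishes. A sketch of the check for (1), with an arbitrary symmetric test matrix:

```python
import numpy as np

rng = np.random.default_rng(5)
n = 4
G = rng.standard_normal((n, n))
S = (G + G.T) / 2            # arbitrary real symmetric matrix

d = np.eye(n)                # d[i, j] plays the role of delta_{ij}
# E[(y^T S y)^2] = sum_{ijkl} S_ij S_kl E[y_i y_j y_k y_l]
lhs = sum(S[i, j] * S[k, l]
          * (d[i, j]*d[k, l] + d[i, k]*d[j, l] + d[i, l]*d[j, k])
          for i in range(n) for j in range(n)
          for k in range(n) for l in range(n))
rhs = 2 * np.trace(S @ S) + np.trace(S) ** 2
# lhs and rhs agree to machine precision
```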

Using Fact 1, we have \begin{align*} &\mathbb{E}[(x^\mathsf{T}Px)^2]\\ =\, & \mathbb{E}[((m + Cy)^\mathsf{T}P(m + Cy))^2]\\ =\,& \mathbb{E}[(y^\mathsf{T}C^\mathsf{T}PCy + 2m^\mathsf{T}PCy + m^\mathsf{T}Pm)^2]\\ =\, & \mathbb{E}[(y^\mathsf{T}C^\mathsf{T}PCy)^2] + \mathbb{E}[(2m^\mathsf{T}PCy)^2] + (m^\mathsf{T}Pm)^2 + \mathbb{E}[4(m^\mathsf{T}PCy) m^\mathsf{T}Pm]\\ &\quad + \mathbb{E}[4(y^\mathsf{T}C^\mathsf{T}PCy)m^\mathsf{T}PCy] + \mathbb{E}[2(y^\mathsf{T}C^\mathsf{T}PCy)m^\mathsf{T}Pm] \tag{1}\\ =\,& 2\mathrm{tr}[(C^\mathsf{T}PC)^2] + [\mathrm{tr}(C^\mathsf{T}PC)]^2 + 4m^\mathsf{T}PC C^\mathsf{T}Pm + (m^\mathsf{T}Pm)^2\\ &\quad + 2m^\mathsf{T}Pm\,\mathrm{tr}[C^\mathsf{T}PC] \\ =\,& 2\mathrm{tr}[P\Sigma P \Sigma] + [\mathrm{tr}(P\Sigma)]^2 + 4m^\mathsf{T}P\Sigma P m + (m^\mathsf{T}Pm)^2 + 2m^\mathsf{T}Pm\,\mathrm{tr}[P\Sigma] \end{align*} where in (1) we have used $y^\mathsf{T}C^\mathsf{T}PCy = \mathrm{tr}[C^\mathsf{T}PCyy^\mathsf{T}]$, and $\mathbb{E}[yy^\mathsf{T}] = I$.

Thus, we have \begin{align*} &\mathbb{E}[x^\mathsf{T}Ax x^\mathsf{T}Bx]\\ =\,& \frac12 \mathrm{tr}[P\Sigma P \Sigma] - \frac12 \mathrm{tr}[Q\Sigma Q \Sigma] + \frac14 [\mathrm{tr}(P\Sigma)]^2 - \frac14 [\mathrm{tr}(Q\Sigma)]^2\\ &\quad + m^\mathsf{T}P\Sigma P^\mathsf{T}m - m^\mathsf{T}Q\Sigma Q^\mathsf{T}m + \frac14 (m^\mathsf{T}Pm)^2 - \frac14 (m^\mathsf{T}Qm)^2\\ &\quad + \frac12 m^\mathsf{T}Pm\,\mathrm{tr}[P\Sigma] - \frac12 m^\mathsf{T}Qm\,\mathrm{tr}[Q\Sigma]\\ =\,& \frac12\mathrm{tr}[(A + A^\mathsf{T})\Sigma (B + B^\mathsf{T})\Sigma] + \frac14\, \mathrm{tr}[(A + A^\mathsf{T})\Sigma]\mathrm{tr}[(B + B^\mathsf{T})\Sigma]\\ &\quad + m^\mathsf{T}(A + A^\mathsf{T})\Sigma (B + B^\mathsf{T})m + \frac14\, m^\mathsf{T}(A + A^\mathsf{T})m m^\mathsf{T}(B + B^\mathsf{T})m\\ &\quad + \frac14\, m^\mathsf{T}(A + A^\mathsf{T})m \, \mathrm{tr}[(B + B^\mathsf{T})\Sigma] + \frac14\, m^\mathsf{T}(B + B^\mathsf{T})m \, \mathrm{tr}[(A + A^\mathsf{T})\Sigma]\\ =\,& \mathrm{tr}[A\Sigma (B + B^\mathsf{T})\Sigma] + m^\mathsf{T}(A + A^\mathsf{T})\Sigma (B + B^\mathsf{T})m\\ &\quad + (\mathrm{tr}[A\Sigma] + m^\mathsf{T}Am)(\mathrm{tr}[B\Sigma] + m^\mathsf{T}Bm). \end{align*}

We are done.

==============================

Answer (second):

Remark: In this second answer, we evaluate the expectation of a product of three quadratic forms.

Problem: Let $x\sim \mathcal{N}(m, \Sigma)$. Let $A, B, C$ be given real symmetric matrices. Find $\mathbb{E}[x^\mathsf{T}Ax x^\mathsf{T}Bx x^\mathsf{T}Cx]$.

Using $$abc = \frac{1}{24}[(a + b + c)^3 - (a + b - c)^3 - (a - b + c)^3 - ( - a + b + c)^3],$$ we have \begin{align*} &\mathbb{E}[x^\mathsf{T}Ax x^\mathsf{T}Bx x^\mathsf{T}Cx]\\[5pt] =\,& \frac{1}{24}\,\mathbb{E}[(x^\mathsf{T}Px)^3] - \frac{1}{24}\,\mathbb{E}[(x^\mathsf{T}Qx)^3] - \frac{1}{24}\,\mathbb{E}[(x^\mathsf{T}Rx)^3] - \frac{1}{24}\,\mathbb{E}[(x^\mathsf{T}Sx)^3] \end{align*} where $P = A + B + C, ~Q = A + B - C, ~R = A - B + C, ~S = -A + B + C$.
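The cubic analogue of the polarization trick is again a pointwise identity, so it can be sanity-checked for a fixed $x$; a sketch with arbitrary symmetric test matrices (the same check also validates the scalar identity for $24abc$):

```python
import numpy as np

rng = np.random.default_rng(6)
n = 3

def sym():
    """An arbitrary real symmetric test matrix."""
    G = rng.standard_normal((n, n))
    return (G + G.T) / 2

A, B, C = sym(), sym(), sym()
x = rng.standard_normal(n)

P, Q, R, S = A + B + C, A + B - C, A - B + C, -A + B + C
lhs = (x @ A @ x) * (x @ B @ x) * (x @ C @ x)
rhs = ((x @ P @ x)**3 - (x @ Q @ x)**3
       - (x @ R @ x)**3 - (x @ S @ x)**3) / 24
# lhs and rhs agree to machine precision for every x
```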

We may express $x = m + \Sigma^{1/2}y$ where $y\sim \mathcal{N}(0, I)$. For convenience, denote $L = \Sigma^{1/2}$ (we avoid the letter $C$ here, since $C$ already names the third matrix). Since $\Sigma$ is symmetric, $L$ is also symmetric.

Fact 1: Let $y \sim \mathcal{N}(0, I)$. Let $M$ be a given real symmetric matrix. Then:
(1) $\mathbb{E}[(y^\mathsf{T}My)^2] = 2\mathrm{tr}(M^2) + [\mathrm{tr}(M)]^2$;
(2) $\mathbb{E}[(y^\mathsf{T}My)^3] = 8\mathrm{tr}[M^3] + 6\mathrm{tr}[M^2]\mathrm{tr}[M] + [\mathrm{tr}(M)]^3$;
(3) $\mathbb{E}[(y^\mathsf{T}My)y ] = 0$;
(4) $\mathbb{E}[(y^\mathsf{T}My)^2y ] = 0$.
(The proof is given at the end.)

Using Fact 1, we have \begin{align*} &\mathbb{E}[(x^\mathsf{T}Px)^3]\\ =\,& \mathbb{E}[((m + Ly)^\mathsf{T}P(m + Ly))^3]\\ =\,& \mathbb{E}[(y^\mathsf{T}L^\mathsf{T}PLy + 2m^\mathsf{T}PLy + m^\mathsf{T}Pm)^3]\\ =\,& \mathbb{E}[(y^\mathsf{T}L^\mathsf{T}PLy + m^\mathsf{T}Pm)^3] + \mathbb{E}[(2m^\mathsf{T}PLy)^3]\\ &\quad + \mathbb{E}[3(y^\mathsf{T}L^\mathsf{T}PLy + m^\mathsf{T}Pm)^2\cdot 2m^\mathsf{T}PLy]\\ &\quad + \mathbb{E}[3(y^\mathsf{T}L^\mathsf{T}PLy + m^\mathsf{T}Pm)\cdot(2m^\mathsf{T}PLy)^2]\\ =\, & \mathbb{E}[(y^\mathsf{T}L^\mathsf{T}PLy + m^\mathsf{T}Pm)^3] + \mathbb{E}[3(y^\mathsf{T}L^\mathsf{T}PLy + m^\mathsf{T}Pm)\cdot(2m^\mathsf{T}PLy)^2]\\ =\,& \mathbb{E}[(y^\mathsf{T}L^\mathsf{T}PLy)^3] + (m^\mathsf{T}Pm)^3 + \mathbb{E}[3(y^\mathsf{T}L^\mathsf{T}PLy)^2m^\mathsf{T}Pm]\\ &\quad + \mathbb{E}[3(y^\mathsf{T}L^\mathsf{T}PLy)(m^\mathsf{T}Pm)^2] + \mathbb{E}[12 y^\mathsf{T}L^\mathsf{T}PLy \cdot y^\mathsf{T}L^\mathsf{T}Pmm^\mathsf{T}PLy]\\ &\quad + \mathbb{E}[12 m^\mathsf{T}Pm\cdot y^\mathsf{T}L^\mathsf{T}Pmm^\mathsf{T}PLy]. \end{align*} The rest of the computation is routine.


Proof of Fact 1: Let $M = U\mathrm{diag}(d_1,d_2, \cdots, d_n)U^T$ be the eigendecomposition. Let $z = U^Ty$. Then $z$ is a random vector with i.i.d. entries distributed as $\mathcal{N}(0, 1)$. Note that $\mathbb{E}[z_i^6] = 15$, $\mathbb{E}[z_i^4] = 3$ and $\mathbb{E}[z_i^2]=1$. We have \begin{align} \mathbb{E}[(y^TMy)^3] &= \mathbb{E}[(z^T\mathrm{diag}(d_1,d_2, \cdots, d_n)z)^3] = \sum_{i,j,k} d_i d_j d_k\, \mathbb{E}[z_i^2 z_j^2 z_k^2]\\ &= 15\sum d_i^3 + 9\sum_{i \ne j} d_i^2 d_j + 6\sum_{i < j < k} d_i d_j d_k\\ &= 6\sum d_i^3 + 9(\sum d_i)(\sum d_i^2) + 6\sum_{i < j < k} d_i d_j d_k\\ &= 8\mathrm{tr}[M^3] + 6\mathrm{tr}[M^2]\mathrm{tr}[M] + [\mathrm{tr}(M)]^3 \end{align} where we have used $\sum_{i \ne j} d_i^2 d_j = (\sum d_i)(\sum d_i^2) - \sum d_i^3$ and the Newton identity $$\sum_{i < j < k} d_i d_j d_k = \frac{1}{6}(\sum d_i)^3 - \frac{1}{2}(\sum d_i)(\sum d_i^2) + \frac{1}{3}\sum d_i^3.$$
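The sixth-moment bookkeeping in this proof can be verified mechanically: with eigenvalues $d_i$, $\mathbb{E}[(y^TMy)^3]=\sum_{i,j,k}d_id_jd_k\,\mathbb{E}[z_i^2z_j^2z_k^2]$, where the moment is $15$, $3$, or $1$ according to whether all, exactly two, or none of the indices coincide. A sketch with an arbitrary symmetric test matrix:

```python
import numpy as np

rng = np.random.default_rng(7)
n = 4
G = rng.standard_normal((n, n))
M = (G + G.T) / 2
d = np.linalg.eigvalsh(M)         # real eigenvalues of the symmetric M

def m6(i, j, k):
    """E[z_i^2 z_j^2 z_k^2] for i.i.d. standard normal z."""
    if i == j == k:
        return 15.0
    if i == j or j == k or i == k:
        return 3.0
    return 1.0

lhs = sum(d[i] * d[j] * d[k] * m6(i, j, k)
          for i in range(n) for j in range(n) for k in range(n))
rhs = (8 * np.trace(M @ M @ M)
       + 6 * np.trace(M @ M) * np.trace(M)
       + np.trace(M) ** 3)
# lhs and rhs agree to machine precision
```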