Expectation of $x$ under multivariate Gaussian distribution


I am reading the proof of $\mathbb{E}[x]=\mathbf{\mu}$ for a multivariate Gaussian distribution from PRML by Bishop (2006 edition), given on page 82, chapter 2. I am not able to derive the steps after equation (2.58) to reach equation (2.59). Specifically, equation (2.58) is given as:

$\mathbb{E}[x] = \frac{1}{(2\pi)^{D/2}}\frac{1}{|\mathbf{\Sigma}|^{1/2}} \int \exp\left\{-\frac{1}{2} \mathbf{z}^{T} \mathbf{\Sigma}^{-1} \mathbf{z} \right\}(\mathbf{z} + \mathbf{\mu})\,d\mathbf{z}$

I can see that the exponential function is an even function of $\mathbf{z}$, but how does the author claim that the term in $\mathbf{z}$ in the factor $(\mathbf{z} + \mathbf{\mu})$ will vanish by symmetry? I may be overlooking a very obvious point, but I have been stuck on this for quite some time. Any direction would be helpful.
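For concreteness, here is a quick numerical sanity check of the claim (a sketch assuming NumPy, with an arbitrary 2-D $\mathbf{\Sigma}$ and $\mathbf{\mu}$ chosen for illustration): samples $\mathbf{z}$ drawn from the zero-mean density in (2.58) average to roughly $\mathbf{0}$, so $\mathbf{z} + \mathbf{\mu}$ averages to roughly $\mathbf{\mu}$.

```python
import numpy as np

rng = np.random.default_rng(0)

# Arbitrary covariance and mean, purely for illustration.
Sigma = np.array([[2.0, 0.5],
                  [0.5, 1.0]])
mu = np.array([1.0, -3.0])

# Sample z ~ N(0, Sigma): its density is the even exponential in (2.58).
z = rng.multivariate_normal(np.zeros(2), Sigma, size=200_000)

# The z-term averages out by symmetry, so E[x] = E[z + mu] ≈ mu.
print(z.mean(axis=0))         # close to [0, 0]
print((z + mu).mean(axis=0))  # close to mu
```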


If you expand the equation, you get:

$$ \begin{align} \mathbb{E}[x] &= \frac{1}{(2\pi)^{D/2}} \frac{1}{|\Sigma|^{1/2}} \int \exp \Big( -\frac{1}{2} Z^T \Sigma^{-1} Z \Big) (Z + \mu) \, dZ \\ &= \frac{1}{(2\pi)^{D/2}} \frac{1}{|\Sigma|^{1/2}} \int \Big[ Z \exp \Big(-\frac{1}{2} Z^T \Sigma^{-1} Z \Big) + \mu \exp \Big(-\frac{1}{2} Z^T \Sigma^{-1} Z \Big) \Big] \, dZ \end{align} $$

If you look at the expression $f(Z) = Z \exp \big(-\frac{1}{2} Z^T \Sigma^{-1} Z \big)$, we can see that this is an odd function because

$$ \begin{align} f(-Z) &= (-Z) \exp \big( -\frac{1}{2} (-Z)^T \Sigma^{-1}(-Z) \big) \\ &= -Z \exp \big(-\frac{1}{2} Z^T \Sigma^{-1} Z \big) \\ &= - f(Z) \end{align} $$

Therefore the integral of that expression is zero, because an odd function integrated over a domain symmetric about the origin vanishes. I believe this is how the author claims that the $Z$ in the factor $(Z + \mu)$ will vanish by symmetry.
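The oddness identity $f(-Z) = -f(Z)$ can be checked numerically; below is a small sketch (assuming NumPy, with an arbitrary $2\times 2$ covariance chosen for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)

# Arbitrary symmetric positive-definite covariance for illustration.
Sigma = np.array([[2.0, 0.5],
                  [0.5, 1.0]])
Sigma_inv = np.linalg.inv(Sigma)

def f(z):
    """The vector-valued integrand Z * exp(-1/2 Z^T Sigma^{-1} Z)."""
    return z * np.exp(-0.5 * z @ Sigma_inv @ z)

# f is odd: f(-z) == -f(z) for any z.
for _ in range(100):
    z = rng.standard_normal(2)
    assert np.allclose(f(-z), -f(z))
print("f(-z) == -f(z) holds")
```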


Let $X = (X_{1},\dots,X_{n})$ denote a multivariate Gaussian random vector, and consider its expectation $\mathbb{E}[X] = (\mathbb{E}[X_{1}],\dots,\mathbb{E}[X_{n}]).$ Accordingly, we need to compute $\mathbb{E}[X_{i}]$ for every $i=1,\dots,n.$ Now, the right-hand side of $$\mathbb{E}[X_{i}] = \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} x_{i} \frac{1}{(2 \pi)^{n/2} |\Sigma|^{1/2}} \exp\left(-\frac{1}{2} (x - \mu)^{t} \Sigma^{-1} (x-\mu)\right) dx_{1} \cdots dx_{n}$$ becomes $$\int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} (y_{i} + \mu_{i}) \frac{1}{(2 \pi)^{n/2} |\Sigma|^{1/2}} \exp\left(-\frac{1}{2} y^{t} \Sigma^{-1} y\right) dy_{1} \cdots dy_{n}$$ after the change of variables $y = x - \mu.$ Next, we split the integral into a sum of two integrals; the first equals $\mu_{i}$ because the density integrates to one, so we obtain $$\mu_{i} + \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} y_{i} \frac{1}{(2 \pi)^{n/2} |\Sigma|^{1/2}} \exp\left(-\frac{1}{2} y^{t} \Sigma^{-1} y\right) dy_{1} \cdots dy_{n}.$$ Then, we focus on the integral $$\int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} y_{i} \exp\left(-\frac{1}{2} y^{t} \Sigma^{-1} y\right) dy_{1} \cdots dy_{n},$$ where we use the spectral theorem to write $\Sigma^{-1} = Q \Lambda^{-1} Q^{t}$ with $Q$ orthogonal, so that the integral becomes $$\int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} y_{i} \exp\left(-\frac{1}{2} (Q^{t}y)^{t} \Lambda^{-1} (Q^{t} y)\right) dy_{1} \cdots dy_{n}.$$ Using the change of variables $z = Q^{t}y$ (whose Jacobian determinant is $|\det Q| = 1$ since $Q$ is orthogonal), writing $y_{i} = \sum_{j=1}^{n} Q_{ij} z_{j}$, and letting $\lambda_{k}$ denote the $k$th eigenvalue of $\Sigma,$ we can rewrite the integral as $$\int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} \sum_{j=1}^{n} Q_{ij} z_{j} \exp\left(-\frac{1}{2} z^{t} \Lambda^{-1} z\right) dz_{1} \cdots dz_{n} = \sum_{j=1}^{n} Q_{ij} \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} z_{j} \exp\left(-\frac{1}{2} \sum_{k=1}^{n} \frac{z_{k}^{2}}{\lambda_{k}}\right) dz_{1} \cdots dz_{n}.$$ Lastly, each term of this sum vanishes: the exponential factors over the coordinates, and the integral over $z_{j}$ is $$\int_{-\infty}^{\infty} z_{j} \exp\left(-\frac{z_{j}^{2}}{2\lambda_{j}}\right) dz_{j} = 0,$$ since the integrand is odd. This implies that the entire integral $$\int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} y_{i} \frac{1}{(2 \pi)^{n/2} |\Sigma|^{1/2}} \exp\left(-\frac{1}{2} y^{t} \Sigma^{-1} y\right) dy_{1} \cdots dy_{n} = 0.$$ Hence, we have that $\mathbb{E}[X_{i}] = \mu_{i}$ for every $i=1,\dots,n,$ and therefore that $\mathbb{E}[X] = \mu.$
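The spectral-decomposition step above can also be checked numerically. The following sketch (assuming NumPy, with an arbitrary $2\times 2$ covariance chosen for illustration) verifies that $\Sigma^{-1} = Q \Lambda^{-1} Q^{t}$ and that the quadratic form decouples into $\sum_{k} z_{k}^{2}/\lambda_{k}$ under $z = Q^{t}y$:

```python
import numpy as np

rng = np.random.default_rng(2)

# Arbitrary symmetric positive-definite covariance for illustration.
Sigma = np.array([[3.0, 1.0],
                  [1.0, 2.0]])

# Spectral decomposition: Sigma = Q diag(lam) Q^T with Q orthogonal.
lam, Q = np.linalg.eigh(Sigma)

# Check Sigma^{-1} = Q diag(1/lam) Q^T.
assert np.allclose(np.linalg.inv(Sigma), Q @ np.diag(1.0 / lam) @ Q.T)

# The change of variables z = Q^T y has unit Jacobian (|det Q| = 1)
# and decouples the quadratic form into sum_k z_k^2 / lam_k.
assert np.isclose(abs(np.linalg.det(Q)), 1.0)
y = rng.standard_normal(2)
z = Q.T @ y
assert np.isclose(y @ np.linalg.inv(Sigma) @ y, np.sum(z**2 / lam))
print("decomposition checks pass")
```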