Mean of multivariate normal in measure theoretic sense


For a random variable $X$ on a probability space $(\Omega,F,P)$, its mean $E[X]$ is given by $E[X] = \int_{\Omega} X dP$.

If the random variable has a density function given by the multivariate normal distribution with mean $\mu \in \mathbb{R}^n$ and covariance $Q$ (symbolically $N(\mu,Q)$) how do I transform $E[X] = \int_{\Omega} X dP$ into $\int_{\mathbb{R}^n} xN(\mu,Q)dx$ ?

On BEST ANSWER

The$\newcommand{\II}{1}\newcommand{\RR}{\mathbb{R}}\newcommand{\FF}{\mathcal{F}}$ way I've gone about answering your question involves a bit of a dive into the elementary mechanics of measure theory. Let me preface by giving some intuition. We have the standard way of integrating on $\RR^n$ — namely, with respect to Lebesgue measure $\lambda$. Given that we can integrate all sorts of functions this way, can we develop alternative notions of "integral" on $\RR^n$ using Lebesgue measure as a foundation? In particular, we would like our new integrals to correspond to familiar notions of probability. So what we're really after is a way of integrating with respect to different probability distributions based on Lebesgue measure.

Let's see if we can now flesh this out more technically. Note: Please let me know if there are any parts of this discussion you would like me to clarify, and I'll be happy to explain more.

Measure-theoretic context

Let $(\Omega, \FF, \lambda)$ be a measure space. We present a general technique for generating new probability measures.

Constructing the measure

Let $f : \Omega \to \RR$ be a measurable function that is nonnegative $\lambda$-almost everywhere and satisfies $$ \int_\Omega f \ d\lambda = 1. $$ Define the set function $P : \FF \to \RR$ by $$ P(E) = \int_\Omega \II_E \cdot f \ d\lambda = \int_E f \ d\lambda, $$ where $\II_E$ is the indicator function of $E$. Since $f$ is nonnegative a.e., it follows that $P(E) \geq 0$ for any $E \in \FF$. Moreover, it is not hard to see that $P$ is, in fact, a probability measure:

  1. By definition of $f$, $P(\Omega) = 1$, and $P(\emptyset) = 0$ because $\II_\emptyset = 0$.
  2. Let $(E_n) \subset \FF$ be a sequence of pairwise disjoint measurable sets, and define $E = \bigcup_{n=1}^{\infty} E_n$. Disjointness gives $\II_E = \sum_{n=1}^{\infty} \II_{E_n}$, and since every term $\II_{E_n} \cdot f$ is nonnegative a.e., the monotone convergence theorem lets us exchange the sum and the integral: \begin{align} P(E) &= \int_{\Omega} \II_E \cdot f \ d\lambda \\ &= \int_{\Omega} \left(\sum_{n=1}^{\infty} \II_{E_n}\right) \cdot f \ d\lambda \\ &= \int_{\Omega} \left(\sum_{n=1}^{\infty} \II_{E_n} \cdot f\right) \ d\lambda \\ &= \sum_{n=1}^{\infty} \int_{\Omega} \II_{E_n} \cdot f \ d\lambda \\ &= \sum_{n=1}^{\infty} P(E_n). \end{align}
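As a quick numerical sanity check of this construction (not part of the proof), here is a minimal Python sketch. It assumes a concrete example that is not in the discussion above: $\Omega = \RR$, $\lambda$ Lebesgue measure, and $f$ the exponential density with rate 1. The names `f` and `P` and the midpoint-rule quadrature are illustrative choices only.

```python
import math

def f(x):
    """Assumed example density: exponential with rate 1 on [0, inf), 0 elsewhere."""
    return math.exp(-x) if x >= 0 else 0.0

def P(a, b, steps=100_000):
    """Approximate P([a, b)) = integral of f over [a, b) with the midpoint rule."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))

# Total mass: the tail beyond 50 is about exp(-50), which is negligible here.
total = P(0.0, 50.0)

# Additivity on disjoint pieces: [0, 3) is the disjoint union of [0, 1) and [1, 3).
whole = P(0.0, 3.0)
parts = P(0.0, 1.0) + P(1.0, 3.0)

print(total, whole, parts)
```

The total mass should come out very close to 1, and `whole` and `parts` should agree up to quadrature error, which is the finite version of the additivity shown in item 2.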

Integration with respect to the measure

Now, we can apply the "standard machine" argument to show that $\int_{\Omega} g \ d P = \int_{\Omega} g \cdot f \ d\lambda$ for any measurable function $g : \Omega \to \RR$ whose integral with respect to $P$ is defined (for example, any nonnegative or $P$-integrable $g$).

  1. If $E \in \FF$, then \begin{align} \int_{\Omega} \II_E \ dP &= P(E) \\ &= \int_{\Omega} \II_E \cdot f \ d\lambda \end{align}
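  2. Linearity of the integral and the above fact imply that $\int_{\Omega} \psi \ d P = \int_{\Omega} \psi \cdot f \ d\lambda$ for any simple function $\psi$.
  3. Any nonnegative measurable function $g$ is the pointwise limit of an increasing sequence of nonnegative simple functions, so the monotone convergence theorem (applied to both sides) and the above fact imply that $\int_{\Omega} g \ dP = \int_{\Omega} g \cdot f \ d\lambda$ for any nonnegative measurable function $g$.
  4. The decomposition $g = g^+ - g^-$ and the above fact imply that $\int_{\Omega} g \ dP = \int_{\Omega} g \cdot f \ d\lambda$ holds for any $P$-integrable $g$ (more generally, whenever at least one of $\int_{\Omega} g^+ \ dP$ and $\int_{\Omega} g^- \ dP$ is finite).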
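The steps above can be mimicked numerically. The following is only a sketch under assumptions of my own choosing: $P$ is the standard normal distribution on $\RR$, $g(x) = x^2$, and the simple functions are constant on dyadic cells of width $2^{-k}$ inside $[-10, 10)$, taking the infimum of $g$ on each cell. The left-hand side $\int \psi_k \ dP$ uses only the measure of each cell (via the normal CDF), while the right-hand side $\int g \cdot f \ d\lambda$ uses only the density.

```python
import math

def Phi(x):
    """Standard normal CDF, so that P([a, b)) = Phi(b) - Phi(a)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def f(x):
    """Standard normal density with respect to Lebesgue measure."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def g(x):
    """Assumed test function for this sketch."""
    return x * x

def simple_integral(k, L=10.0):
    """Integral w.r.t. P of the simple function equal to inf g on each
    cell of width 2**-k in [-L, L), and 0 outside: sum of value * P(cell)."""
    h = 2.0 ** -k
    n = int(round(2 * L / h))
    total = 0.0
    for i in range(n):
        a, b = -L + i * h, -L + (i + 1) * h
        low = 0.0 if a <= 0.0 <= b else min(g(a), g(b))  # inf of x^2 on [a, b]
        total += low * (Phi(b) - Phi(a))
    return total

def density_integral(L=10.0, steps=20_000):
    """Midpoint-rule approximation of the integral of g * f over [-L, L]."""
    h = 2 * L / steps
    return h * sum(g(-L + (i + 0.5) * h) * f(-L + (i + 0.5) * h) for i in range(steps))

lower = [simple_integral(k) for k in (2, 4, 6, 8)]
via_density = density_integral()
print(lower, via_density)
```

Because the dyadic partitions are nested and each simple function lies below $g$, the values in `lower` increase with $k$ and approach `via_density`, which is the behaviour the monotone convergence step relies on.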

Back to the original question

In our case, we take $\Omega = \RR^n$ with its Borel $\sigma$-algebra. We want to construct the probability measure $P$ corresponding to the multivariate normal distribution $N(\mu, Q)$, where $Q$ is symmetric positive definite so that $Q^{-1}$ exists. I tried to be sneaky in the previous section by hinting at how to proceed. If $\lambda$ is Lebesgue measure on $\RR^n$, and we define $f : \RR^n \to \RR$ by $$ f(x) = \frac{1}{\sqrt{(2\pi)^n \det Q}} \exp\left(-\frac{1}{2} (x - \mu)^T Q^{-1} (x-\mu)\right), $$ then we have

  • $f(x) \geq 0$ for all $x \in \RR^n$,
  • $$ \int_{\RR^n} f \ d\lambda = 1, $$

and thus there exists a probability measure $P$ on $\RR^n$ such that $$ P(E) = \int_{\RR^n} \II_E \cdot f \ d\lambda = \int_E f(x) \ dx, $$ where the right-hand side is the usual notation for the Lebesgue integral on $\RR^n$. Since $f$ is continuous, it agrees with the familiar Riemann integral whenever $E$ is a sufficiently nice set, such as a box.

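But of course, $f$ was chosen in a special way: it is the density of the multivariate normal random variable $X \sim N(\mu, Q)$, and on this probability space we can take $X$ to be the identity map $X(x) = x$. Applying the result of the previous section to each coordinate function $g(x) = x_i$ (each of which is $P$-integrable), it follows, at last, that \begin{align} E[X] &= \int_{\RR^n} x \ dP \\ &= \int_{\RR^n} x \cdot f \ d\lambda \\ &= \int_{\RR^n} x \cdot f(x) \ dx. \end{align}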
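To see the final formula in action, here is a rough numerical check in two dimensions. The values of `mu` and `Q` below are made up for illustration, and the midpoint rule on a truncated square is just one convenient way to approximate the integral over $\RR^2$; none of this is needed for the argument itself.

```python
import math

# Assumed example parameters (not from the question): n = 2.
mu = (1.0, -2.0)
Q = ((2.0, 0.6), (0.6, 1.0))  # symmetric positive definite

det = Q[0][0] * Q[1][1] - Q[0][1] * Q[1][0]
Qinv = ((Q[1][1] / det, -Q[0][1] / det),
        (-Q[1][0] / det, Q[0][0] / det))

def f(x, y):
    """Density of N(mu, Q) with respect to Lebesgue measure on R^2."""
    dx, dy = x - mu[0], y - mu[1]
    quad = Qinv[0][0] * dx * dx + 2.0 * Qinv[0][1] * dx * dy + Qinv[1][1] * dy * dy
    return math.exp(-0.5 * quad) / math.sqrt((2.0 * math.pi) ** 2 * det)

def moments(half_width=10.0, steps=400):
    """Midpoint rule on the square mu +/- half_width.
    Returns approximations of (integral of f, integral of x * f)."""
    h = 2.0 * half_width / steps
    mass = mx = my = 0.0
    for i in range(steps):
        x = mu[0] - half_width + (i + 0.5) * h
        for j in range(steps):
            y = mu[1] - half_width + (j + 0.5) * h
            w = f(x, y)
            mass += w
            mx += x * w
            my += y * w
    cell = h * h
    return mass * cell, (mx * cell, my * cell)

mass, mean = moments()
print(mass, mean)
```

Up to truncation and quadrature error, `mass` should be close to 1 and `mean` close to `mu`, matching $E[X] = \int_{\RR^n} x \cdot f(x) \ dx = \mu$.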