Dumb question: Computing expectation without change of variable formula


Possibly related question: Making sense of measure-theoretic definition of random variable

Given a random variable $X$ on $(\Omega, \mathscr{F}, \mathbb{P})$, its law $\mathcal{L}_X$ and a Borel function $g: \mathbb{R} \to \mathbb{R}$,

  1. $$E[g(X)] := \int_{\Omega} g(X(\omega)) d\mathbb{P}(\omega)$$

  2. The change of variable theorem allows us to compute it as follows:

$$E[g(X)] = \int_{\mathbb{R}} g(t) d\mathcal{L}_X(t)$$

Dumb question: Without using change of variable theorem, how do we compute $E[g(X)]$?

-

Side question: Is the point of the change of variable formula to get back to Riemann or Riemann–Stieltjes integrals, so as to avoid computing the Lebesgue integral directly?

-

I guess the answer is to use the measure-theoretic definition of expectation for measurable functions. But since the proof of the change of variable formula itself proceeds through indicator, simple, nonnegative, and general measurable functions, it seems like we would end up reinventing the wheel. Humour me anyway, please: how exactly would we be reinventing the wheel?


Say, for example, $g(x) = x^2$ and $X \sim \text{Unif}([0,1])$. Then how do we compute

$$\int_{\Omega} X(\omega)^2 d\mathbb{P}(\omega) \tag{*}$$

?


Here's what I got so far.

$$ (*) = \int_{\Omega} (X(\omega)^2)^{+} d\mathbb{P}(\omega) - \int_{\Omega} (X(\omega)^2)^{-} d\mathbb{P}(\omega)$$

where we compute $$\int_{\Omega} (X(\omega)^2)^{+} d\mathbb{P}(\omega) = \sup_{h \in SF^{+}, h \le (X^2)^{+}}\{\int_{\Omega} h d \mathbb P\}$$

and where we compute $$\int_{\Omega} h \, d \mathbb P = \int_{\Omega} \big(a_1 1_{A_1} + \cdots + a_n 1_{A_n}\big) \, d \mathbb P = \int_{\Omega} a_1 1_{A_1} \, d \mathbb P + \cdots + \int_{\Omega} a_n 1_{A_n} \, d \mathbb P$$

where $A_1, ..., A_n \in \mathscr F$

and finally where we compute

$$\int_{\Omega} a_1 1_{A_1} \, d \mathbb P = a_1 \int_{\Omega} 1_{A_1} \, d \mathbb P = a_1 \mathbb P(A_1).$$
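To make this chain concrete, here is a small sketch (my own illustration; the function name is made up) that computes $\int_{\Omega} X(\omega)^2 \, d\mathbb P(\omega)$ for $X(\omega) = \omega$ on $([0,1], \mathscr B[0,1], \lambda)$ using the standard dyadic simple functions $h_n = \lfloor 2^n X^2 \rfloor / 2^n$, i.e. summing $a_k \, \mathbb P(A_k)$ over the level sets:

```python
import math

# Sketch (not from the post): approximate E[X^2] for X(omega) = omega on
# ([0,1], Borel, Lebesgue) from below by dyadic simple functions
# h_n = floor(2^n X^2) / 2^n, and integrate h_n as sum a_k * P(A_k).
def expectation_via_simple_functions(n):
    total = 0.0
    levels = 2 ** n  # h_n takes the values k / 2^n, k = 0, ..., 2^n - 1
    for k in range(levels):
        a_k = k / levels
        # A_k = {omega in [0,1] : k/2^n <= omega^2 < (k+1)/2^n}
        #     = [sqrt(k/2^n), sqrt((k+1)/2^n)), so P(A_k) is its length
        p_k = math.sqrt((k + 1) / levels) - math.sqrt(k / levels)
        total += a_k * p_k  # one term a_k * P(A_k) of the simple integral
    return total

print(expectation_via_simple_functions(12))  # increases to 1/3 as n grows
```

Each $A_k$ here is an interval, so $\mathbb P(A_k)$ is just its length; the values increase to $1/3$, matching $\int_0^1 t^2 \, dt$.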


Without using the change of variable formula, would we have to come up with indicator and simple functions that lead to a uniformly distributed random variable?

If so, what are these indicator and simple functions that lead to a uniform distribution, please?

If not, what should we do?


As for the probability space, I was thinking that $X$ being distributed as 'Unif(0,1)' means $X$ is defined on $(\Omega, \mathscr F, \mathbb P) = ([0,1], \mathscr B[0,1], \lambda)$ or $([0,1], \mathscr M[0,1], \lambda)$?


Actually, I was hoping there would be a way to define $X$ explicitly. For a discrete uniform distribution, say, where $X$ represents toss of a fair die, I guess we would have

$(\Omega, \mathscr F, \mathbb P) = (\{1, \dots ,6\}, 2^{\Omega}, \mathbb P)$ with $\mathbb P(\{\omega\}) = \frac16$ for each $\omega \in \Omega$, and $X(\omega) = \sum_{n=1}^{6} n \cdot 1_{\{\omega = n\}}(\omega)$

Then, attempting a continuous analogue for $X \sim \text{Unif}(0,1)$,

$$E[X] = \int_{\Omega}\int_0^1 n \, 1_{\{\omega = n\}}(\omega)\,dn\,d\mathbb P(\omega)$$

$$ = \int_0^1 n \int_{\Omega} 1_{\{\omega = n\}}(\omega)\,d\mathbb P(\omega)\,dn \tag{by Fubini's?}$$

$$ = \int_0^1 n \, \mathbb P(\{\omega = n\}) \, dn$$

$$ = \int_0^1 n f_X(n) \, dn$$

$$ = \int_0^1 n \cdot \frac{1}{1-0} \, dn$$

$$ = \int_0^1 n \, dn$$

$$ = \left.\frac{n^2}{2}\right|_{0}^{1}$$

$$ = \frac12 - 0 = \frac12$$

As for the second moment,

$$E[X^2] = \int_{\Omega} (\int_0^1 n 1_{\{n = \omega\}}(\omega)dn)^2 d\mathbb P(\omega)$$

$$E[X^2] = \int_{\Omega} \int_0^1 n 1_{\{n = \omega\}}(\omega)dn \int_0^1 m 1_{\{m = \omega\}}(\omega)dm d\mathbb P(\omega)$$

$$E[X^2] = \int_{\Omega} \int_0^1 \int_0^1 n m 1_{\{n = m = \omega\}}(\omega)dn dm d\mathbb P(\omega)$$

$$E[X^2] = \int_{\Omega} \int_0^1 \int_0^1 n^2 1_{\{n = n = \omega\}}(\omega)dn dn d\mathbb P(\omega) \tag{??}$$

$$E[X^2] = \int_0^1 \int_0^1 n^2 dn dn \tag{??}$$

$$E[X^2] = \frac13$$

I think I can do similarly for the discrete uniform, but both the discrete and continuous uniform are simple examples. What does $X \sim N(\mu,\sigma^2)$ look like? I guess it would be $X = X^+ - X^-$ where $X^{\pm} = \sup\{\text{simple functions}\}$. Should/can we use the central limit theorem? I'm thinking a Bernoulli is an indicator, a binomial is a simple function, and then we use binomials to approximate the normal?

I guess I'm not making much sense, but what references/topics can I look up for something similar that does? For example, where can I read about explicit representations of random variables, or approximations of them by simple functions, for computing such integrals without the change of variable formula?

There are 2 answers below.

BEST ANSWER

This is too long for a comment, so I'll post here in an attempt to make this as basic as possible. For your die roll example, let $\Omega = \{1,2,\dots, 6\}$, $\mathscr F = 2^\Omega$ and $\mathbb P$ be the (normalized) counting measure.

We may define the random variable $X:\Omega \longrightarrow [0,+\infty)$ as $X(\omega) = \omega$. In other words, $X$ is the result of a die roll and it is uniform because of the probability measure we've chosen. We'd have

\begin{align} \mathbb E(X) &= \int_{\Omega} X(\omega) \,d\mathbb P(\omega) \\ &= \int_0^\infty \mathbb P\Big(X^{-1}\big((t, +\infty)\big)\Big)\, dt \\ &= \int_{0}^1 \mathbb P\big(\{1,2,3,4,5,6\}\big)\, dt + \int_{1}^2 \mathbb P\big(\{2,3,4,5,6\}\big)\, dt + \int_{2}^3 \mathbb P\big(\{3,4,5,6\}\big)\, dt \\ &\quad + \int_{3}^4 \mathbb P\big(\{4,5,6\}\big)\, dt + \int_{4}^5 \mathbb P\big(\{5,6\}\big)\, dt + \int_{5}^6 \mathbb P\big(\{6\}\big)\, dt \\ &= 1+\frac56+\frac46+\frac36+\frac26+\frac16 = 3.5 \end{align}

That said, I think the formalization of probability is in general very messy and I may not be able to help with harder examples.
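As a minimal sketch (the function name is mine), the six-interval computation above is just a finite sum of tail probabilities:

```python
# Sketch: the layer-cake computation E(X) = integral_0^inf P(X > t) dt
# for the fair die, where P(X > t) is constant on each interval [k, k+1).
def die_expectation():
    # on [k, k+1), P(X > t) = (6 - k)/6, so the integral is six terms
    return sum((6 - k) / 6 for k in range(6))

print(die_expectation())  # ≈ 3.5
```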


In a similar vein, for the 'Unif(0,1)' example we have $\Omega = [0,1]$, $\mathscr F$ can be either the Borel or the Lebesgue-measurable subsets of $[0,1]$, and $\mathbb P$ is the Lebesgue measure $\mu$.
The random variable $X : \Omega \longrightarrow [0,+\infty)$ is defined as $X(\omega) = \omega$. Then

\begin{align} \mathbb E(X) &= \int_{\Omega} X(\omega) \,d\mathbb P(\omega) \\ &= \int_0^\infty\mathbb P\Big(X^{-1}\big((t, +\infty)\big)\Big)\, dt \\ &= \int_{0}^1 \mu\big((t,1]\big)\, dt \\ &= \int_0^1 (1-t) \,dt = {\left[t-\frac{t^2}2\right]}_0^1 = 1-\frac12 = \frac12 \end{align}
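The same tail-probability formula can be checked numerically (a sketch of my own; a midpoint Riemann sum stands in for the $dt$-integral):

```python
# Sketch: for Unif(0,1), P(X > t) = 1 - t on [0, 1]; approximate
# E(X) = integral_0^1 (1 - t) dt with a midpoint Riemann sum.
def uniform_expectation(n=10_000):
    dt = 1 / n
    return sum((1 - (i + 0.5) * dt) * dt for i in range(n))

print(uniform_expectation())  # ≈ 0.5
```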

ANSWER

This was actually pretty basic (as I suspected): Use Skorokhod representation (so-called in David Williams' Probability with Martingales). (*)

For a given cdf $F$, the random variable can be explicitly represented by $$X(\omega) = \sup\{y \in \mathbb{R}: F(y) < \omega\},$$ regarded as an element of $\mathscr L^1 \big((0,1),\mathscr B(0,1), \mu\big)$, where $\mu$ is Lebesgue measure.

E.g. for the exponential distribution, $X \sim \text{Exp}(\lambda)$:

$$F(y) < \omega$$

$$\iff 1-e^{-\lambda y} < \omega$$

$$\iff y < \frac{1}{\lambda} \ln(\frac{1}{1-\omega})$$

Thus, $$X(\omega) = \sup\{y \in \mathbb{R}: F(y) < \omega\} = \sup\Big(-\infty,\frac{1}{\lambda} \ln\Big(\frac{1}{1-\omega}\Big)\Big) = \frac{1}{\lambda} \ln\Big(\frac{1}{1-\omega}\Big)$$

Hence $$E[X] = \int_0^1 \frac{1}{\lambda} \ln\Big(\frac{1}{1-\omega}\Big) d\mu(\omega) = \int_0^1 \frac{1}{\lambda} \ln\Big(\frac{1}{1-\omega}\Big) d\omega$$

It can be verified that this integral is the same as

$$E[X] = \int_{\mathbb R} \lambda x e^{-\lambda x} 1_{(0,\infty)}(x)\, dx$$
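As a quick numerical sanity check (my own sketch, not part of the original answer), one can evaluate the $\omega$-integral with a midpoint rule and compare it to the known mean $1/\lambda$ of $\text{Exp}(\lambda)$:

```python
import math

# Sketch: evaluate E[X] = integral_0^1 (1/lam) ln(1/(1-w)) dw numerically.
# The integrand blows up at w = 1, so a midpoint rule avoids the endpoint.
def exp_mean_via_skorokhod(lam, n=200_000):
    dw = 1 / n
    return sum((1 / lam) * math.log(1 / (1 - (i + 0.5) * dw)) * dw
               for i in range(n))

print(exp_mean_via_skorokhod(2.0))  # ≈ 0.5, i.e. 1/lambda
```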

E.g. for the continuous uniform distribution, $U \sim \text{Unif}((a,b))$:

$$F(y) < \omega$$

$$\iff \frac{y-a}{b-a} < \omega$$

$$\iff y < a + \omega(b-a)$$

Thus, $$U(\omega) = a + \omega(b-a)$$

Hence $$E[U] = \int_0^1 \big(a + \omega(b-a)\big)\, d\mu(\omega) = \int_0^1 \big(a + \omega(b-a)\big)\, d\omega$$

It can be verified that this integral is the same as

$$E[U] = \int_{\mathbb R} u\, 1_{(a,b)}(u)\,\frac{1}{b-a}\, du$$
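A similar sketch (again my own) checks the uniform case against the familiar mean $(a+b)/2$:

```python
# Sketch: evaluate E[U] = integral_0^1 (a + w(b-a)) dw with a midpoint
# Riemann sum; the integrand is linear, so the rule is essentially exact.
def unif_mean_via_skorokhod(a, b, n=10_000):
    dw = 1 / n
    return sum((a + (i + 0.5) * dw * (b - a)) * dw for i in range(n))

print(unif_mean_via_skorokhod(2.0, 5.0))  # ≈ 3.5 = (a + b)/2
```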


(*) This can also be called the canonical representation (MAT 235A/235B: Probability, lectures by Prof. Roman Vershynin, typeset by Edward D. Kim) or the Skorokhod representation of random variables using quantile transforms (Optimal Transport Methods in Economics by Alfred Galichon).

Skorokhod representations relate to quantile functions, similarly defined:

$$Q(p) = \inf\{x \in \mathbb R | F(x) \ge p\}$$
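As an illustration of the quantile transform (a sketch under my own naming, not from the cited texts), $Q$ for a fair die can be used for inverse-transform sampling: feeding it Unif(0,1) draws produces a uniformly distributed die roll:

```python
import random

# Sketch: Q(p) = inf{x : F(x) >= p} for a fair die, where F jumps by
# 1/6 at each of 1, ..., 6. Used as an inverse-transform sampler.
def die_quantile(p):
    for x in range(1, 7):
        if x / 6 >= p:  # first x with F(x) >= p
            return x
    return 6

random.seed(0)
draws = [die_quantile(random.random()) for _ in range(60_000)]
print(sum(draws) / len(draws))  # sample mean, close to 3.5
```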

On the Wikipedia page for random variables, under 'Distribution functions', it says:

The probability distribution "forgets" about the particular probability space used to define X and only records the probabilities of various values of X. [...] In practice, one often disposes of the space $\Omega$ altogether and just puts a measure on $\mathbb {R}$ that assigns measure 1 to the whole real line, i.e., one works with probability distributions instead of random variables.