Proof of Expected Value Property for Product of Independent Random Variables


I keep seeing this property come up for two random variables $X,Y$ on a probability space $(\Omega,\mathcal{M}, P)$: if the two random variables are independent, then $\mathbb{E}[XY] = \mathbb{E}[X] \mathbb{E}[Y]$. But I am confused about the space and the measure with respect to which the expected values are computed.

Start with the definition that two random variables are independent if for any Borel sets $A,B\in\mathcal{B}(\mathbb{R})$ the events $\{X\in A\},\{Y\in B\}$ are independent. From this definition we can compute that if $X,Y$ are simple functions with values $x_1,\ldots, x_n$ and $y_1,\ldots, y_m$ respectively, then $$ \mathbb{E}[XY] =\sum_{i,j}x_iy_j P(X = x_i, Y = y_j) =\Big(\sum_{i}x_i P(X=x_i)\Big)\Big(\sum_{j}y_j P(Y=y_j)\Big)=\mathbb{E}[X] \mathbb{E}[Y]$$
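As a quick numerical sanity check of this simple-function identity (all values and probabilities below are made up for illustration), one can take the joint pmf to be the product of the marginals, which is exactly what independence says for simple random variables:

```python
import numpy as np

# Hypothetical simple random variables: values and marginal probabilities
# are chosen arbitrarily; independence is encoded by taking the joint pmf
# to be the product of the marginals.
x_vals = np.array([1.0, 2.0, 5.0])
p = np.array([0.2, 0.5, 0.3])          # P(X = x_i)
y_vals = np.array([-1.0, 0.5, 3.0])
q = np.array([0.4, 0.4, 0.2])          # P(Y = y_j)

joint = np.outer(p, q)                 # P(X = x_i, Y = y_j) = P(X = x_i) P(Y = y_j)

E_XY = np.sum(np.outer(x_vals, y_vals) * joint)   # sum_{i,j} x_i y_j P(X=x_i, Y=y_j)
E_X = np.sum(x_vals * p)
E_Y = np.sum(y_vals * q)
```

Here `E_XY` and `E_X * E_Y` agree exactly, since the double sum factors term by term.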

I would assume from here that we need $X,Y\in L^2(P)$ to justify interchanging the limits as needed above. We would then run a standard limiting argument: choose sequences of simple random variables $X_n, Y_n$ with $X_n\to X$ and $Y_n \to Y$, and conclude from $\mathbb{E}[X_n Y_n] = \mathbb{E}[X_n] \mathbb{E}[Y_n]$ that $\mathbb{E}[XY] = \mathbb{E}[X] \mathbb{E}[Y]$.

The above roughly makes sense to me until I see definitions like the one at https://en.wikipedia.org/wiki/Independence_(probability_theory)#Two_random_variables where independence seems to be defined through the joint distribution on a product space. So if $Q = P\otimes P$, should the expected values above be written as $\mathbb{E}_Q[XY] = \mathbb{E}_P[X] \mathbb{E}_P[Y]$?

So is the first notation $\mathbb{E}[XY] = \mathbb{E}[X] \mathbb{E}[Y]$ just an abuse of notation for what is actually occurring? Does talking about independent random variables implicitly mean that we are working in a product space?


BEST ANSWER

Assume that $X,Y \in \mathcal L^1$.

Let us denote the distribution of $X$ by $\mu$ and the distribution of $Y$ by $\nu$, and denote the joint distribution of $(X,Y)$ by $\pi$. The probability measures $\mu$ and $\nu$ are therefore measures on $\Bbb R$, while $\pi$ is a measure on the product space $\Bbb R \times \Bbb R$ with $\sigma$-algebra $\mathcal B( \Bbb R ) \otimes \mathcal B( \Bbb R )$. The point is that the quantities $\Bbb E [X], \Bbb E [Y], \Bbb E [XY]$, if they exist, depend only on $\mu, \nu, \pi$, respectively, since by integrating with respect to the pushforward measures $\mu, \nu, \pi$ we have

$$\Bbb E [X]=\int_\Omega X(\omega) d P(\omega ) = \int_{\Bbb R} x d \mu (x), \quad \Bbb E [Y]=\int_\Omega Y(\omega) d P(\omega ) = \int_{\Bbb R} y d \nu (y)$$ and $$\Bbb E [XY] = \int_\Omega X(\omega)Y(\omega) d P(\omega ) = \int_{\Bbb R^2} xy d \pi (x,y)$$
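The pushforward identity can be checked on a toy finite space; the sample space, measure, and values of $X$ below are all made up for illustration:

```python
import numpy as np

# Hypothetical finite sample space Omega = {0,1,2,3} with probabilities P,
# illustrating that E[X] can be computed on Omega or via the pushforward mu.
P = np.array([0.1, 0.2, 0.3, 0.4])        # P({omega}) for omega = 0..3
X = np.array([5.0, 1.0, 5.0, 1.0])        # X(omega); takes the values {1, 5}

E_on_Omega = np.sum(X * P)                # integral of X dP over Omega

# Pushforward mu = P o X^{-1}: total mass carried to each value of X
values = np.unique(X)
mu = np.array([P[X == v].sum() for v in values])
E_via_mu = np.sum(values * mu)            # integral of x d mu over R
```

Both computations give the same number, which is the content of the change-of-variables formula used above.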

Now note that $X,Y$ are independent if and only if $$\pi = \mu \otimes \nu$$ by the definition and uniqueness of the product measure on $\sigma$-finite measure spaces.

But this means that, if $X,Y$ are independent, then by the Fubini–Tonelli theorem (the German Wikipedia page treats the general case with arbitrary $\sigma$-finite measures) we have

$$\int_{\Bbb R^2} xy \, d \pi (x,y) = \int_{\Bbb R} \int_{\Bbb R} xy \, d\mu (x) \, d \nu (y) = \int_{\Bbb R} x \, d \mu (x) \int_{\Bbb R} y \, d \nu (y)$$ where the right-hand side exists because $X,Y \in \mathcal L^1$.
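A Monte Carlo sketch of this factorization (the two distributions are chosen arbitrarily for illustration): for independent samples, the sample mean of $XY$ should approach the product of the individual sample means.

```python
import numpy as np

# Independent X ~ Exp(mean 2) and Y ~ Unif(-1, 3); distributions are
# arbitrary illustrative choices, not tied to the answer above.
rng = np.random.default_rng(0)
n = 1_000_000
X = rng.exponential(scale=2.0, size=n)   # E[X] = 2
Y = rng.uniform(-1.0, 3.0, size=n)       # E[Y] = 1

lhs = np.mean(X * Y)                     # estimates E[XY]
rhs = np.mean(X) * np.mean(Y)            # estimates E[X] E[Y]
```

With a million samples the two estimates agree to a few decimal places, both near the true value $2 \cdot 1 = 2$.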


This answer is based on the book Measures, Integrals and Martingales, by René L. Schilling. The author derives the Expected Value Property before defining product measures and product $\sigma$-algebras. So indeed, you can derive this property without the use of Fubini's theorem. Here is how Schilling does it:

First, let's start with the notion of independence. Let $(\Omega, \mathcal A, P)$ be a probability space and $\mathcal B, \mathcal C \subset \mathcal A$ be two sub-$\sigma$-algebras. We say that $\mathcal B$ and $\mathcal C$ are independent if $$ P(B \cap C)=P(B)P(C) \, \forall B \in \mathcal B, C \in \mathcal C $$

With that in mind, we move on to the Expected Value Property. First assume that $u = \mathbb I_B$ and $w = \mathbb I_C$ for some $B \in \mathcal B$ and $C \in \mathcal C$. Because of independence we have

$$ \int uw \, dP = P(B \cap C) = P(B)P(C) = \int u \, dP \int w \, dP $$

Now, for positive simple functions $u = \sum_j \alpha_j \mathbb I_{B_j}$ and $w = \sum_i \beta_i \mathbb I_{C_i}$, with $B_j \in \mathcal B$ and $C_i \in \mathcal C$, we get

$$ \int uw \, dP = \sum_{j,i}\alpha_j \beta_i \int \mathbb I_{B_j} \mathbb I_{C_i} \, dP = \sum_{j,i}\alpha_j \beta_i P(B_j \cap C_i) = \sum_{j,i}\alpha_j \beta_i P(B_j)P(C_i) = \left( \sum_j \alpha_j P(B_j) \right) \left( \sum_i \beta_i P(C_i) \right) = \int u \, dP \int w \, dP $$
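This simple-function step can be checked concretely on a small example (the space and coefficients below are made up): take $\Omega$ to be two fair coin flips, with $u$ depending only on the first flip (so $u$ is $\mathcal B$-measurable) and $w$ only on the second, the two flips being independent under the uniform measure.

```python
import numpy as np

# Omega = {(a, b) : a, b in {0, 1}}, uniform P; the sigma-algebra generated
# by the first coordinate is independent of the one generated by the second.
omega = np.array([(a, b) for a in (0, 1) for b in (0, 1)])
P = np.full(4, 0.25)                       # uniform probability on 4 outcomes

alpha = np.array([3.0, 7.0])               # u = sum_j alpha_j 1_{first flip = j}
beta = np.array([-2.0, 5.0])               # w = sum_i beta_i  1_{second flip = i}
u = alpha[omega[:, 0]]
w = beta[omega[:, 1]]

int_uw = np.sum(u * w * P)                 # integral of u*w dP
int_u = np.sum(u * P)
int_w = np.sum(w * P)
```

As expected, `int_uw` equals `int_u * int_w` exactly.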

Now, for $u \in \mathcal M^+(\mathcal B)$ and $w \in \mathcal M^+(\mathcal C)$, we can choose approximating simple functions $u_n \uparrow u$ and $w_n \uparrow w$; then $u_n w_n \uparrow uw$, and by the Monotone Convergence Theorem: $$ \int uw \, dP = \lim_{n\to\infty} \int u_n w_n \, dP = \lim_{n\to\infty} \int u_n \, dP \int w_n \, dP = \int u \, dP \int w \, dP $$
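The monotone-approximation step can be visualized on a toy example (everything below is an illustrative choice, not part of the proof): approximate $u(x) = x$ on $[0,1]$ with Lebesgue measure by the standard dyadic simple functions $u_n = \lfloor 2^n u \rfloor / 2^n$, and watch the integrals increase toward $\int u = 1/2$.

```python
import numpy as np

# Approximate ([0,1], Lebesgue) on a fine grid; u(x) = x.
grid = np.linspace(0.0, 1.0, 200_001)
dx = grid[1] - grid[0]
u = grid

ints = []
for n in (1, 2, 4, 8):
    u_n = np.floor(2.0**n * u) / 2.0**n    # simple function, u_n <= u, u_n increasing in n
    ints.append(np.sum(u_n) * dx)          # numerical integral of u_n
```

The exact value of $\int u_n$ here is $1/2 - 2^{-(n+1)}$, so the sequence of integrals is strictly increasing with limit $1/2$, as the Monotone Convergence Theorem predicts.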

Finally, if $u \in L^1(\mathcal B)$ and $w \in L^1(\mathcal C)$, then $u w$ is integrable, because by the positive case applied to $|u|$ and $|w|$ we get $$ \int | u w | \, dP = \int | u | \, dP \int | w | \, dP < + \infty $$ So we split $u$ and $w$ into their positive and negative parts, and finally

$$ \int uw dP = \int u dP \int w dP $$