Conflicting definitions of the sum of random variables in proofs of additive property of expected value


While reading proofs of the reasonable assertion that expected value is linear, I came across two kinds of proofs: a one-liner and another that is hard to parse. In trying to decipher the latter and determine what makes the two proofs different, I began to wonder if I don't understand the definition of the sum of two or more random variables. Here is more information:

At the end of page 10, these notes from MIT give a one-line proof of linearity. This proof seems to define the sum of two random variables $f,g:\Omega \to \mathbb{R}$ as the sum of these as functions with the same domain, so exactly what you would expect $f+g$ to be. A proof that extends the same idea to the sum of any finite number of random variables is given on page 12 of this handout from Chicago.
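To make the first definition concrete, here is a minimal numerical sketch (my own toy example, not taken from the MIT notes): random variables on a finite space are just real-valued functions on the sample space, their sum is the pointwise sum, and the one-line proof of linearity amounts to regrouping one sum over outcomes.

```python
from fractions import Fraction

# Toy finite probability space: a fair six-sided die.
omega = [1, 2, 3, 4, 5, 6]
p = {w: Fraction(1, 6) for w in omega}  # uniform probability measure

# Random variables are just real-valued functions on omega.
f = lambda w: w        # the number shown on the die
g = lambda w: w % 2    # 1 if the roll is odd, 0 if even

# The sum of random variables is the pointwise sum of functions:
# (f + g)(w) = f(w) + g(w).
h = lambda w: f(w) + g(w)

def E(X):
    """Expected value: a single sum over the sample space, MIT-style."""
    return sum(X(w) * p[w] for w in omega)

# Linearity falls out by splitting the sum termwise.
assert E(h) == E(f) + E(g)
```

Here `E(f)` is $7/2$, `E(g)` is $1/2$, and `E(h)` is $4$, as expected.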

However, on page 241 of the PDF of this book from Dartmouth (labelled page 231 at the top of the page), the proof is quite different, and I was unable to fully understand it. It seems to be working with random variables that do not necessarily have the same sample space as their domains, and the domain of the sum of the random variables is something like a Cartesian product (though I could be mistaken about that last point). As far as I can tell, the same proof appears in this Brilliant article.

What seems to be different about the two proofs is that the former iterates over the elements of a sample space whereas the latter does some sort of double iteration. But maybe I'm missing something and they are both correct. Given a finite probability space, I have some questions:

  1. Am I correct in taking the definition of a random variable to be any real-valued function with the sample space as its domain?
  2. What is the definition of the sum of two random variables? What about the sum of $n$ random variables for some positive integer $n$?
  3. Where can I find a correct proof of the linearity of the expected value of random variables using the correct definition of the sum of random variables?
On BEST ANSWER

In the MIT notes the two random variables are over the same sample space $S$. In the Dartmouth book they are taken over possibly different sample spaces $\Omega_X$ and $\Omega_Y$, and in order to combine the experiments that they represent into a single joint experiment, we must combine their sample spaces. Outcomes in the combined experiment are ordered pairs of outcomes, one from $\Omega_X$ and one from $\Omega_Y$, so the appropriate sample space is the Cartesian product $\Omega_X\times\Omega_Y$. Then

$$E(X+Y)=\sum_{\langle x,y\rangle\in\Omega_X\times\Omega_Y}(x+y)P(\langle X,Y\rangle=\langle x,y\rangle)$$

really is just the sum over the sample space as in the MIT notes. Since $\Omega_X=\{x_n:n\in\Bbb Z^+\}$ and $\Omega_Y=\{y_n:n\in\Bbb Z^+\}$, we can rewrite the summation over $\Omega_X\times\Omega_Y$ as the double summation

$$\sum_{j\ge 1}\sum_{k\ge 1}(x_j+y_k)P(\langle X,Y\rangle=\langle x_j,y_k\rangle)=\sum_{j\ge 1}\sum_{k\ge 1}(x_j+y_k)P(X=x_j\text{ and }Y=y_k)$$

and continue as in the text. It really is the same thing; the author of the MIT notes is trusting the reader to recognize that it may be necessary to construct a sample space that properly combines the natural spaces of two (or more) random variables, while the author of the Dartmouth book is showing explicitly how this is to be done, but without actually saying that that is what he’s doing.
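The equivalence of the two computations can be checked numerically. The following is a small sketch of my own (the specific distributions are made up for illustration): a single sum over the product sample space $\Omega_X\times\Omega_Y$ gives the same value as the double summation, which splits into $E(X)+E(Y)$.

```python
from fractions import Fraction
from itertools import product

# Toy joint experiment: X is a fair coin (0 or 1), Y a fair die (1..6),
# combined on the product sample space. Independence is assumed here only
# for convenience; the double-sum argument works for any joint distribution.
vals_X = [0, 1]
vals_Y = [1, 2, 3, 4, 5, 6]
joint = {(x, y): Fraction(1, 2) * Fraction(1, 6)
         for x, y in product(vals_X, vals_Y)}

# E(X+Y) as one sum over the product space, as in the displayed formula ...
E_sum = sum((x + y) * joint[(x, y)] for (x, y) in joint)

# ... equals the double summation, which splits into E(X) + E(Y)
# via the marginal distributions.
E_X = sum(x * sum(joint[(x, y)] for y in vals_Y) for x in vals_X)
E_Y = sum(y * sum(joint[(x, y)] for x in vals_X) for y in vals_Y)

assert E_sum == E_X + E_Y  # linearity via the Dartmouth-style double sum
```

Here `E_X` is $1/2$, `E_Y` is $7/2$, and both sides come out to $4$.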