Expectation of sum of random variables


I'm having trouble proving the following lemma for my statistics course:

Let $X_1,\dots,X_n$ be a random sample from a distribution $P$ on $\mathbb{R}$, let $X \sim P$, and let $g$ be measurable such that $\mathrm{E}\,g(X)$ and $\mathrm{var}\,g(X)$ exist. Then

$\mathrm{E}(\sum_{i=1}^ng(X_i))=n\cdot\mathrm{E}\,g(X)$

$\mathrm{var}(\sum_{i=1}^ng(X_i))=n\cdot\mathrm{var}\,g(X)$

I have only a vague idea of the proof; I know it is directly related to $\sum_{i=1}^{n}X_i = n\bar{X}$ and, correspondingly, to the random variables being i.i.d.


2 Answers

BEST ANSWER

The first is simply linearity of expectation applied to $g(X_i)$. So $\mathbb{E}[\sum_{i=1}^ng(X_i)] = \sum_{i=1}^n \mathbb{E}[g(X_i)] = n\cdot\mathbb{E}[g(X)]$

Assuming the $X_i$ are independent, then so too are the $g(X_i)$, meaning $\mathbb{E}[g(X_i)g(X_j)]=\mathbb{E}[g(X_i)]\,\mathbb{E}[g(X_j)]=(\mathbb{E}[g(X)])^2$ when $j \neq i$, so:

$\mathrm{var}(\sum_{i=1}^ng(X_i)) \\= \mathbb{E}[(\sum_{i=1}^ng(X_i))^2] - (\mathbb{E}[\sum_{i=1}^ng(X_i)])^2 \\= \sum_{i=1}^n\mathbb{E}[(g(X_i)^2)]+\sum_{i=1}^n\sum_{j\not=i}\mathbb{E}[g(X_i)g(X_j)] - (n\cdot\mathbb{E}[g(X)])^2 \\=n\cdot\mathbb{E}[(g(X)^2)]+n(n-1)(\mathbb{E}[g(X)])^2 - n^2\cdot(\mathbb{E}[g(X)])^2 \\=n\cdot\mathbb{E}[(g(X)^2)] - n\cdot(\mathbb{E}[g(X)])^2 \\=n\cdot\mathrm{var}(g(X))$
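As a quick sanity check (not part of the proof), the two identities can be verified by simulation. The choices below are arbitrary for illustration: $P = \mathrm{Uniform}(0,1)$ and $g(x)=x^2$, so $\mathrm{E}\,g(X)=1/3$ and $\mathrm{var}\,g(X)=1/5-1/9=4/45$.

```python
import random
import statistics

random.seed(0)

n = 5             # sample size in the lemma
trials = 200_000  # Monte Carlo replications

def g(x):
    return x * x  # an arbitrary measurable g with finite mean and variance

# Each replication draws X_1, ..., X_n i.i.d. Uniform(0, 1) and records sum g(X_i)
sums = [sum(g(random.random()) for _ in range(n)) for _ in range(trials)]

mean_hat = statistics.fmean(sums)   # should be close to n * E g(X)   = 5 * 1/3
var_hat = statistics.pvariance(sums)  # should be close to n * var g(X) = 5 * 4/45

print(mean_hat, var_hat)
```

The printed estimates should land near $5/3 \approx 1.667$ and $20/45 \approx 0.444$ up to Monte Carlo noise.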

ANSWER

What is your definition of expectation? For simplicity, let's assume that your random variables are continuous, i.e. $\mathbb{E}[X] = \int x f_X(x) \,\mathrm{d}x$ for some probability density function. (Of course, this works for general random variables that are discrete, continuous, mixed, etc.)

The first property follows from the linearity of expectation and the fact that each $X_i$ is identically distributed (independence is actually not necessary). That is, we use the fact that for random variables $X$ and $Y$, $\mathbb{E}[X+Y] = \mathbb{E}[X] + \mathbb{E}[Y]$. This follows from the linearity of the integral: assuming the joint density exists for simplicity again, $ \iint (x + y)f_{X,Y}(x,y) \,\mathrm{d}x \,\mathrm{d}y = \int x f_X(x) \,\mathrm{d} x + \int y f_Y(y) \,\mathrm{d} y . $

The second property does indeed require independence. That comes into play through the following property: if $X$ and $Y$ are independent random variables, then $\mathrm{Var}(X+Y) = \mathrm{Var}(X) + \mathrm{Var}(Y)$. If you can prove this, then your desired result follows. To do so, you can use the definition of the variance, together with the fact that independence implies $\mathbb{E}[XY] = \mathbb{E}[X]\, \mathbb{E}[Y]$.
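To make the key step concrete, here is a small simulation (the distributions are arbitrary choices, not part of the answer) checking that $\mathrm{Var}(X+Y) = \mathrm{Var}(X) + \mathrm{Var}(Y)$ for independent $X$ and $Y$:

```python
import random
import statistics

random.seed(1)
N = 200_000

# Independent draws: X ~ Uniform(0, 1) and Y ~ Exponential(rate 1)
xs = [random.random() for _ in range(N)]
ys = [random.expovariate(1.0) for _ in range(N)]

var_x = statistics.pvariance(xs)  # theoretical value 1/12
var_y = statistics.pvariance(ys)  # theoretical value 1
var_sum = statistics.pvariance([x + y for x, y in zip(xs, ys)])

# var_sum and var_x + var_y should nearly agree
print(var_sum, var_x + var_y)
```

Running the same check with dependent draws (e.g. `ys = xs`) shows the identity fail, which is exactly where independence enters.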