Is $E[Bin(X,p)]=E[X]p$?


We are given a random variable $X$ with mean $E[X]=:\mu$, and we are interested in a random variable $Y \sim Bin(X,p)$.

Is it true that $$E[Y]=\mu p?$$

What confuses me is that normally the expectation of a Binomial random variable $Bin(n,p)$ is $np$. Now, if we have $Bin(X,p)$, is the expectation $Xp$ or $E[X]p$?

What might confuse me further: suppose we have a set of random variables $X_i$ with means $E[X_i]=\mu_i$, and consider $Y:=\sum_i Y_i$ with $Y_i \sim Bin(X_i,p_i)$. Then $E[Y]=\sum_i \mu_i p_i$. But if we restate the problem and consider $Z:=\sum_i \sum_{j=1}^{X_i} Z_{ij}$ with $Z_{ij} \sim Be(p_i)$ i.i.d., then it is not as easy to compute the expectation, or is it? But since $Z$ and $Y$ are the same quantity, we should have $E[Z]=E[Y]$?
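The claimed identity $E[Y]=\mu p$ is easy to check by simulation. A minimal sketch, assuming (purely for illustration) $X \sim \mathrm{Poisson}(4)$ and $p=0.3$, so $\mu p = 1.2$:

```python
# Monte Carlo check of E[Y] = E[X] * p for Y ~ Bin(X, p).
# X ~ Poisson(lam) is an illustrative assumption; any nonnegative
# integer-valued X with finite mean works the same way.
import math
import random

random.seed(0)
lam, p, trials = 4.0, 0.3, 200_000

def poisson(lam):
    # Knuth's multiplication algorithm for a Poisson(lam) sample
    threshold, k, prod = math.exp(-lam), 0, 1.0
    while True:
        prod *= random.random()
        if prod <= threshold:
            return k
        k += 1

total = 0
for _ in range(trials):
    x = poisson(lam)
    # Y | X = x  ~  Bin(x, p): count successes in x Bernoulli(p) trials
    y = sum(1 for _ in range(x) if random.random() < p)
    total += y

print(total / trials)   # close to lam * p = 1.2
```

Note that sampling $Y$ as a sum of $X$ Bernoulli trials is exactly the restatement via the $Z_{ij}$ above, so the two formulations agree by construction.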


Accepted answer

Define $\xi_i$ to be a family of independent, identically distributed Bernoulli variables with success probability $p$, independent of $X$. Then $E[\xi_i]=p$ for all $i$. Note that $$Y=\sum_{i=1}^X\xi_i $$

Wald's identity states that if $N$ is a positive integer-valued random variable with finite expectation and the $\eta_i$ are independent, identically distributed random variables independent of $N$, then $$E\left[\sum_{i=1}^N\eta_i \right]=E[N]E[\eta_1] $$

Applying Wald's identity to $Y$ immediately gives your result. Proving Wald's identity is easy. Note what is happening: even though the expectation operator is linear (it distributes over sums), you cannot apply linearity directly here, because the number of summands $N$ is random. But if you use the tower property of conditional expectation, $$ E\left[\sum_{i=1}^N\eta_i \right]=E\left[E\left[\sum_{i=1}^N\eta_i \mid N\right]\right], $$ then inside the inner expectation $N$ can be treated, in practical terms, as a number rather than a random variable, so you can distribute the expectation over the sum: $$E\left[E\left[\sum_{i=1}^N\eta_i \mid N\right]\right] =E\left[\sum_{i=1}^N E\left[\eta_i \mid N\right]\right]. $$ Since the $\eta_i$ are independent of $N$, you can drop the conditioning on $N$: $$E\left[\sum_{i=1}^N E\left[\eta_i \mid N\right]\right]=E\left[\sum_{i=1}^N E\left[\eta_i\right]\right]. $$ Since the $\eta_i$ are identically distributed and there are $N$ of them, $$ E\left[\sum_{i=1}^N E\left[\eta_i\right]\right] =E\left[N E\left[\eta_1\right]\right]. $$ Finally, since $E\left[\eta_1\right]$ is a constant, it can be pulled out of the expectation (which is taken with respect to the distribution of $N$), yielding the final result $$ E\left[\sum_{i=1}^N\eta_i \right]=E\left[N\right] E\left[\eta_1\right]. $$ Observation: we used that $N$ has finite expectation; otherwise the double-expectation step is not justified.
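Wald's identity is also easy to verify numerically. A minimal sketch, where $N$ geometric and $\eta_i$ exponential are purely illustrative assumptions (the identity only needs the $\eta_i$ i.i.d., independent of $N$, with $E[N]<\infty$):

```python
# Simulate E[sum_{i=1}^N eta_i] and compare with E[N] * E[eta_1].
# N ~ Geometric(q) on {1, 2, ...}  (E[N] = 1/q = 4)  and
# eta_i ~ Exp(rate)  (E[eta_1] = 1/rate = 0.5)  are illustrative choices.
import random

random.seed(1)
q, rate, trials = 0.25, 2.0, 200_000

total = 0.0
for _ in range(trials):
    n = 1
    while random.random() >= q:   # number of trials until first success
        n += 1
    total += sum(random.expovariate(rate) for _ in range(n))

print(total / trials)   # close to E[N] * E[eta_1] = 4 * 0.5 = 2.0
```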


About your confusion

Using the reasoning above, $$ E\left[\sum_{i=1}^n Y_i \right] = E\left[\sum_{i=1}^n \sum_{j=1}^{X_i} \xi_{ij} \right], $$ where the $\xi_{ij}$ are i.i.d. Bernoulli($p$) variables. Since $n$ is a fixed number, the expectation distributes over the outer sum:

$$ E\left[\sum_{i=1}^n \sum_{j=1}^{X_i} \xi_{ij} \right] = \sum_{i=1}^n E\left[ \sum_{j=1}^{X_i} \xi_{ij} \right] = \sum_{i=1}^n E[X_i]p, $$ and if we additionally assume the $X_i$ are identically distributed, this becomes $$ E\left[\sum_{i=1}^n Y_i \right] = nE[X_1]p. $$
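A Monte Carlo sanity check of $E\left[\sum_i Y_i\right]=\sum_i E[X_i]\,p$ with non-identical $X_i$; the choice $X_i \sim \mathrm{Uniform}\{0,\dots,m_i\}$ below is an illustrative assumption, with $E[X_i]=m_i/2$:

```python
# Check E[sum_i Y_i] = sum_i E[X_i] * p with Y_i ~ Bin(X_i, p)
# and non-identical X_i ~ Uniform{0, ..., m_i} (illustrative choice).
import random

random.seed(2)
p, ms, trials = 0.4, [2, 5, 9], 200_000    # m_i for each X_i
expected = sum(m / 2 for m in ms) * p      # sum_i E[X_i] * p = 3.2

total = 0
for _ in range(trials):
    for m in ms:
        x = random.randint(0, m)                      # X_i
        total += sum(1 for _ in range(x)
                     if random.random() < p)          # Y_i ~ Bin(X_i, p)

print(total / trials)   # close to 3.2
```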

Second answer

$E[Y] = E[E[Y \mid X]]$

The conditional expectation is $E[Y \mid X] = pX$, so $E[Y] = E[pX] = pE[X] = \mu p$.

Third answer

Formally, whenever you have a parametrized family of distributions $D(\theta)$ with parameter set $\theta\in \Theta$, and some random variable $X$ which takes values in $\Theta$ and has distribution $q$, if you define a new random variable $Y \sim D(X)$, then the distribution of $Y$ is given by $$ \mathsf P(Y\in A) = \int_\Theta D(A\mid\theta)\,q(\mathrm d\theta), $$ which you can think of as an instance of the law of total probability, $$ \mathsf P(Y\in A) = \int_\Theta \mathsf P(Y\in A\mid X = \theta)\,\mathsf P(X \in \mathrm d\theta), $$ only the former equation is a bit more precise. In your case $\Theta = \Bbb N$, so all the integrals turn into sums. In particular, $$ \mathsf E Y = \sum_{n\in\Bbb N}n\cdot p\cdot q(n) = p\sum_{n\in\Bbb N}n\cdot q(n) = p\cdot\mathsf E X. $$ I think the same approach lets you answer your other question.
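Since $\Theta = \Bbb N$ here, the mixture formula can be evaluated exactly (no simulation) for a finitely supported $q$. The particular distribution $q$ below is an illustrative assumption; exact rational arithmetic makes the identity $\mathsf E Y = p\cdot\mathsf E X$ hold with equality rather than approximately:

```python
# Exact version of the mixture formula for a finitely supported q:
# P(Y = k) = sum_n q(n) * Bin(n, p)(k), from which E[Y] = p * E[X].
# The distribution q on {0, 1, 2, 3} is an illustrative assumption.
from fractions import Fraction
from math import comb

p = Fraction(1, 3)
q = {0: Fraction(1, 8), 1: Fraction(1, 4),
     2: Fraction(1, 2), 3: Fraction(1, 8)}     # q(n) = P(X = n)

def binom_pmf(n, k, p):
    return comb(n, k) * p**k * (1 - p)**(n - k)

# pmf of Y via the mixture  P(Y = k) = sum_n q(n) * Bin(n, p)(k)
pmf_y = {k: sum(qn * binom_pmf(n, k, p) for n, qn in q.items() if k <= n)
         for k in range(max(q) + 1)}

ey = sum(k * pk for k, pk in pmf_y.items())    # E[Y] from the mixture pmf
ex = sum(n * qn for n, qn in q.items())        # E[X]
print(ey == p * ex)   # True: E[Y] = p * E[X] exactly
```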