Calculating the expectation of binomial distribution without calculating the summation


We know that the expectation of a binomial distribution is $$\sum _{k=1}^{n}\left(\begin{array}{c}n\\ k\end{array}\right){p}^{k}{\left(1-p\right)}^{n-k}k = np$$
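This identity can be checked numerically for sample values of $n$ and $p$ (the choices $n=10$, $p=0.3$ below are arbitrary):

```python
from math import comb

# Direct numerical check of sum_k k*C(n,k)*p^k*(1-p)^(n-k) = n*p
# for arbitrary sample values n = 10, p = 0.3.
n, p = 10, 0.3
expectation = sum(k * comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1))
print(expectation, n * p)  # both 3.0, up to floating-point rounding
```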

But in the proof, one writes: let

$$X= X_{1}+X_{2}+X_{3}+X_{4}+\dotsb+X_{n}$$

where,

$$ X_{i} =\begin{cases} 1,& \text{success at $i$th trial}\\ 0,& \text{otherwise.}\end{cases}$$

I am having difficulty understanding this decomposition. How can the mapping $X$, which counts the number of successes in $n$ trials, be equal to a sum of random variables that each deal only with the success or failure of a single $i$th flip? The rest of the proof is straightforward, but I cannot understand this decomposition.

Please help.

There are 2 answers below.

Best answer:

Each random variable $X_i$ is either $1$ or $0$. It is $1$ if the $i$-th trial is a success and $0$ otherwise. Thus,

$$X_1+X_2+\ldots+X_n$$

is simply the total number of trials that were successes, i.e., the total number of successes.

Suppose, for example, that $n=5$, and we have successes on trials $1,2$, and $4$; then $X_1=X_2=X_4=1$, $X_3=X_5=0$, and

$$X_1+X_2+X_3+X_4+X_5=1+1+0+1+0=3\;.$$

Since we added $1$ for each trial that resulted in a success, we ended up with the total number of successes, in this case $3$.
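The $n=5$ example above can be written out directly, treating each $X_i$ as a $0$/$1$ entry in a list:

```python
# The n = 5 example from the text: successes on trials 1, 2, and 4.
outcomes = [1, 1, 0, 1, 0]  # X_1, ..., X_5 as 0/1 indicators
X = sum(outcomes)           # adding 1 for each successful trial
print(X)  # 3, the total number of successes
```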

The random variable $X$ is by definition the total number of successes in our $n$ trials, so

$$X=X_1+X_2+\ldots+X_n\;.$$

If exactly $k$ of the random variables $X_1,\ldots,X_n$ are equal to $1$, the sum of these random variables is $k$. But the only way for exactly $k$ of the random variables $X_1,\ldots,X_n$ to be equal to $1$ is for $k$ of the $n$ trials to be successes.
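A quick Monte Carlo sketch of this construction: build $X$ as a sum of indicator variables $X_i$ and check that its average over many repetitions lands near $np$ (the values $n=20$, $p=0.25$ are arbitrary):

```python
import random

# Simulate X = X_1 + ... + X_n, where each X_i is 1 with probability p,
# and compare the empirical mean of X with n*p.
random.seed(0)
n, p, reps = 20, 0.25, 100_000
total = 0
for _ in range(reps):
    X = sum(1 if random.random() < p else 0 for _ in range(n))  # sum of indicators
    total += X
print(total / reps)  # close to n*p = 5.0
```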

Second answer:

How can the mapping $X$, which counts the number of successes in $n$ trials, be equal to a sum of random variables that each deal only with the success or failure of a single $i$th flip?

$X$ is the count of successes in a series of $n$ independent Bernoulli trials of identical success rate $p$.

The indicator random variables, denoted $X_i$, equal $1$ if the $i$th trial is a success and $0$ if it is a failure. That is, $X_i$ is the count of successes on that particular trial.

Then the sum $~\sum\limits_{i=1}^n X_i~$ is the count of successes in all $n$ trials, which is exactly what $X$ is.

$$X=\sum_{i=1}^n X_i$$

Then we just note that $\mathsf E(X_i)=p$ and use Linearity of Expectation.

$$\mathsf E(X)~=~\mathsf E(\sum_{i=1}^n X_i) ~=~ \sum_{i=1}^n\mathsf E(X_i) ~=~ n\,p$$

That is all.
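The linearity argument can be verified exhaustively for a small $n$: enumerate all $2^n$ outcome sequences $(X_1,\dots,X_n)$, weight each by its probability, and compare $\mathsf E(X)$ with $np$ (here $n=4$, $p=0.6$, chosen arbitrarily):

```python
from itertools import product

# Exhaustive check for small n: E(X) computed over all 2^n outcomes equals n*p.
n, p = 4, 0.6
E_X = 0.0
for outcome in product([0, 1], repeat=n):  # each outcome is (X_1, ..., X_n)
    prob = 1.0
    for x in outcome:
        prob *= p if x == 1 else (1 - p)   # independent trials
    E_X += prob * sum(outcome)             # sum(outcome) is X for this outcome
print(E_X, n * p)  # both 2.4, up to floating-point rounding
```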


PS: $~\mathsf E(X_i) ~=~ 1\cdot p + 0\cdot(1-p) ~=~ p~$.


PPS: We can also use the formula for the variance of a sum (the covariance terms vanish because the trials are independent) to obtain the variance.

$$\begin{align}\mathsf {Var}(X) =&~ \sum_{i=1}^n\mathsf {Var}(X_i) + 2\mathop{\sum\sum}\limits_{1\leq i < j \leq n}\mathsf {Cov}(X_i, X_j) \\[0.5ex] =&~ \sum_{i=1}^n (\mathsf E(X_i^2)-\mathsf E(X_i)^2) + 0 \\[0.5ex] =&~ n\,(p-p^2) \\[0.75ex] =&~ n\,p\,(1-p) \end{align}$$
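The variance claim can likewise be checked numerically from the binomial pmf (sample values $n=12$, $p=0.4$):

```python
from math import comb

# Compute Var(X) directly from the binomial pmf and compare with n*p*(1-p).
n, p = 12, 0.4
pmf = [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]
mean = sum(k * w for k, w in zip(range(n + 1), pmf))
var = sum((k - mean) ** 2 * w for k, w in zip(range(n + 1), pmf))
print(var, n * p * (1 - p))  # both 2.88, up to floating-point rounding
```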