Deriving the variance of a binomial distribution


I know that the variance of a binomial distribution is the number of trials multiplied by the variance of each trial, but I'm not seeing the derivation of this. Here's my logic so far:

For each trial ($x$), $p$ = probability of success (1), and $1-p$ = probability of failure (0):

$$E(x) = 1\cdot p+0\cdot(1-p) = p$$ $$E(x^2) = 1^2\cdot p+0^2\cdot(1-p) = p$$ $$Var(x) = E(x^2)-E(x)^2 = p - p^2 = p(1-p)$$

From here, for any combination of trials ($X$):

$$X = x_1 + x_2 + \cdots + x_n$$ $$E(X) = E(x_1) + E(x_2) + \cdots + E(x_n)$$ $$E(X) = np$$ $$E(X^2) = E(x_1^2) + E(x_2^2) + \cdots + E(x_n^2)$$ $$E(X^2) = np$$

By this, the logic indicates the variance would be:

$$Var(X) = E(X^2) - E(X)^2 = np - (np)^2 = np(1-np)$$

...however, this is not correct, since the variance is as follows:

$$Var(X) = Var(x_1) + Var(x_2) + \cdots + Var(x_n)$$ $$Var(X) = p(1-p) + p(1-p) + \cdots + p(1-p)$$ $$Var(X) = np(1-p)$$

I'm not seeing where my derivation goes wrong mathematically, producing the incorrect $n$ inside the parentheses.

There are 2 answers below.


Your computation of $\mathbb{E}[X^2]$ is incorrect. In particular, $$ \mathbb{E}[X^2] \neq \mathbb{E}[X_1^2] + \dots + \mathbb{E}[X_n^2], $$ since $\mathbb{E}[X^2] = \mathbb{E}[(X_1+\dots+X_n)^2]$, and expanding the square produces cross terms $\mathbb{E}[X_i X_j]$ that do not vanish.
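A quick way to see the gap is to enumerate the $n = 2$, $p = 1/2$ case exactly. This is a sketch of my own; the variable names are not from the question:

```python
from itertools import product

# Exact check (n = 2 trials, p = 0.5) that E[X^2] is NOT the
# sum of the E[X_i^2].
n, p = 2, 0.5

E_X2 = 0.0  # E[(X_1 + ... + X_n)^2], by exact enumeration
for outcome in product([0, 1], repeat=n):
    prob = 1.0
    for x in outcome:
        prob *= p if x == 1 else 1 - p
    E_X2 += prob * sum(outcome) ** 2

sum_E_Xi2 = n * p  # each E[X_i^2] = p, so the sum is np

print(E_X2)       # 1.5
print(sum_E_Xi2)  # 1.0
```

The difference, $0.5$, is exactly the cross-term contribution $2\,\mathbb{E}[X_1 X_2] = 2p^2$.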


For $i \neq j$, the trials are independent, so $E[X_i X_j] = E[X_i]\,E[X_j] = p^2$.

So $E X^2 = \sum_{i=1}^n E X_i^2 + \sum_{i \neq j} E[X_i X_j] = np + (n^2-n)p^2$, and therefore $$\operatorname{var} X = E X^2 - (E X)^2 = np + (n^2-n)p^2 - n^2p^2 = np(1-p).$$
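As a final sanity check (my own sketch, using only Python's standard library), simulating $X$ directly recovers $np(1-p)$ rather than the question's $np(1-np)$:

```python
import random

random.seed(0)
n, p, trials = 10, 0.3, 200_000

# Simulate X = X_1 + ... + X_n and estimate its variance empirically.
samples = [sum(1 for _ in range(n) if random.random() < p)
           for _ in range(trials)]
mean = sum(samples) / trials
var = sum((s - mean) ** 2 for s in samples) / trials

print(var)                  # ≈ 2.1, matching n*p*(1-p)
print(n * p * (1 - p))      # ≈ 2.1
print(n * p * (1 - n * p))  # ≈ -6.3, negative, so it cannot be a variance
```

The rejected formula even goes negative once $np > 1$, which is impossible for a variance.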