$B_{1}, ... , B_{n}$ are many (~$10^6$) random variables following Bernoulli distributions.
$p_{i} = P(B_{i} = 1)$
Define $p_{avg} = \frac{1}{N}\sum_{n} p_{i}$
Can I say: $\frac{1}{N}\sum_{n}{p_{i}(1 - p_{i})} \approx p_{avg}(1 - p_{avg})$ ?
What is the error of this approximation?
Is this approximation unbiased?
Thank you.
Since $f(x)=x(1-x)=x-x^2$ is a strictly concave function of $x$, you will have $$\frac{1}{N}\sum_{n}{p_{i}(1 - p_{i})} \le p_\text{avg}(1 - p_\text{avg})$$ with equality only when all the $p_i$ are equal, which answers your bias question
The worst case will be when half the $p_i$ are $0$ and the other half $1$, in which case $p_\text{avg}=\frac12$ and