Difference of two sample means from the same binomial distribution


Here is the question I am trying to answer: There are 4000 birds in a forest, and each bird either sings with probability 30% or stays silent with probability 70%. The birds are independent of one another, and the overall noise is proportional to the number of birds singing. If you listen to the forest at two different times, what is the probability that the noise at the second time is more than 3% larger than the noise at the first time?

It is clear to me that this is a binomial distribution with n=4000 and p=0.3, and these noise levels are sample means, but I don't know where to go from there. I don't know how to determine the probability that the second sample mean will be more than 3% higher. Would Bayes' theorem help? Or approximating the distribution as a normal distribution? And if I do approximate the distribution with a normal, how do I determine the probability that $X_2 > 1.03X_1$? I only know how to determine the probability that a point on a normal distribution is above a certain value. Any help would be great!

Best answer

It rather depends on what tools you have.

You could use R to work out the probability of each possible number of chirping birds in the first sample, multiply it by the probability that the second sample has more than $1.03$ times as many, and sum over all cases, i.e.

$$\sum_{n=0}^{4000} \sum_{m=\lfloor 1.03n\rfloor+1}^{4000} {4000 \choose n} {4000 \choose m}0.3^{n+m} 0.7^{8000-n-m} $$

with something like

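b <- 4000   # number of birds
p <- 0.30   # probability that a bird sings
n <- 0:b    # possible counts of singing birds at the first time
# P(X1 = n) * P(X2 > 1.03 n), summed over n
sum(dbinom(n,b,p)*(1-pbinom(n*1.03,b,p)))
# 0.1934043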
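
If you want an independent sanity check on the exact sum, a simulation is one option. The following is only a sketch: the seed and the number of simulated pairs (`n_sims`) are arbitrary choices made here, not part of the original calculation.

```r
# Monte Carlo check of P(X2 > 1.03 * X1) for two independent Binomial(4000, 0.3) counts.
set.seed(1)                  # arbitrary seed, only for reproducibility
n_sims <- 1e6                # assumed number of simulated pairs; raise for more precision
b <- 4000                    # number of birds
p <- 0.30                    # probability that a bird sings
x1 <- rbinom(n_sims, b, p)   # singing birds at the first time
x2 <- rbinom(n_sims, b, p)   # singing birds at the second time
estimate <- mean(x2 > 1.03 * x1)                        # fraction of pairs with X2 > 1.03 X1
std_error <- sqrt(estimate * (1 - estimate) / n_sims)   # binomial standard error of that fraction
c(estimate = estimate, std_error = std_error)
```

With a million pairs the standard error should be around $0.0004$, so the estimate should agree with the exact sum to roughly three decimal places.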

or you could try a normal approximation to the ratio $X_2/X_1$, treated as an uncorrelated noncentral normal ratio, which here would have a central value of $1$ and, in this context, a variance of around $\frac{2 \times 4000 \times 0.3 \times 0.7}{(4000 \times 0.3)^2}=\frac{7}{6000}$, so

1-pnorm(1.03,1,sqrt(7/6000))
# 0.1898877
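
One way to see where the $\frac{7}{6000}$ comes from is a first-order (delta-method) expansion of the ratio around the common mean. This is a sketch in notation that is not in the answer itself: $\mu = 4000 \times 0.3 = 1200$ and $\sigma^2 = 4000 \times 0.3 \times 0.7 = 840$ denote the mean and variance shared by $X_1$ and $X_2$, and the two counts are assumed independent.

$$\frac{X_2}{X_1} \approx 1 + \frac{X_2-\mu}{\mu} - \frac{X_1-\mu}{\mu}, \qquad \operatorname{Var}\left(\frac{X_2}{X_1}\right) \approx \frac{\sigma^2}{\mu^2}+\frac{\sigma^2}{\mu^2} = \frac{2 \times 840}{1200^2} = \frac{7}{6000}$$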

which is not bad in this particular case. You would get the same answer with the crude and theoretically wrong approach of saying that the mean number of chirping birds is $4000\times 0.3 =1200$ and $3\%$ of this is $36$, while the difference between the two i.i.d. binomial counts has mean $0$ and variance $2\times 4000 \times 0.3 \times 0.7=1680$, and then using a normal approximation

1-pnorm(36,0,sqrt(2*4000*0.3*0.7))
# 0.1898877    
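
The agreement is not a coincidence: multiplying the ratio calculation through by the mean count $1200$ turns it into the difference calculation. As a sketch, writing $Z$ for a standard normal variable (notation assumed here):

$$P\left(Z > \frac{1.03-1}{\sqrt{7/6000}}\right) = P\left(Z > \frac{0.03\times 1200}{1200\sqrt{7/6000}}\right) = P\left(Z > \frac{36}{\sqrt{1680}}\right), \qquad \frac{36}{\sqrt{1680}} \approx 0.878$$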

A better approach might be Gábor Pálovics's suggestion of computing $P(X_2 - 1.03 X_1 > 0)$, where $X_2 - 1.03 X_1$ has mean $4000 \times 0.3 - 1.03 \times 4000 \times 0.3= -36$ and variance $(1+1.03^2) \times 4000 \times 0.3 \times 0.7 = 1731.156$, and then using a normal approximation

1-pnorm(0,-36,sqrt(1731.156))
# 0.1934547

which is spectacularly close to the exact value of $0.1934043$ computed above.
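
The mean and variance used in this last version follow from linearity of expectation and the independence of $X_1$ and $X_2$. A short derivation, with $c = 1.03$, $N = 4000$ and $p = 0.3$ introduced here only as shorthand:

$$E[X_2 - cX_1] = (1-c)\,Np = -0.03\times 1200 = -36, \qquad \operatorname{Var}(X_2 - cX_1) = (1+c^2)\,Np(1-p) = 2.0609\times 840 = 1731.156$$

Because $X_2 > 1.03X_1$ is exactly the event $X_2 - 1.03X_1 > 0$, the only approximation in this version is treating a linear combination of the two counts as normal. No linearization of a ratio is needed, which is plausibly why it tracks the exact sum more closely.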