How to visualise expected number of occurrences in Poisson and Binomial distributions?


Below are some sample experiments:

On the average, only 1 person in 1000 has a particular rare blood type.

$p = 1/1,000$, assume $n = 10,000$

An advertiser drops 10,000 leaflets on a city which has 2000 blocks. Assume that each leaflet has an equal chance of landing on each block.

$p = 1/2,000, n = 10,000$

In a class of 80 students, the professor calls on 1 student chosen at random for a recitation in each class period. There are 32 class periods in a term.

$p = 1/80, n = 32$

If I have $P(X = j)$ and am asked to find the expected number out of $n$ where $X = j$, I think the answer is $nP(X = j)$. However, I'm having trouble visualising what this means.

For the leaflets example, am I dropping the 10,000 leaflets over the 2000 blocks 2000 times, so that for a specific block $2000P(X = j)$ is the number of times I expect that block to have $j$ leaflets? I have also seen this described as a single drop of 10,000 leaflets over the 2,000 blocks where, on average, there will be $2000P(X = j)$ blocks with $j$ leaflets. This confuses me: although I can see how each trial (the throwing of a single leaflet) is independent of the others, all leaflets have to land somewhere in the 2000 blocks. How come the restrictions this adds (e.g. all blocks getting no leaflets is impossible) don't affect the probability distribution of the leaflets?

If both are correct, I'm also having trouble drawing the equivalency between the two, since, for the student example, having $32P(X = j)$ periods where a student is called 2, 3, ... times doesn't really make sense if we only have one student chosen per period (but does if the 32 is number of experiments). Similarly, for the blood type example, we can't have, say, both 0 and 4 people in 10,000 with that specific blood type, but this is possible for different groups of 10,000.


There are 2 best solutions below


b) The number of leaflets landing on a given block is approximately Poisson distributed with mean $np = 10000/2000 = 5$.



The way to think about it is as follows:

You have 10000 leaflets (10000 chances to drop them one by one)

For a particular block, each leaflet will hit it with probability 1/2000. Here we can apply the binomial distribution. Let $X$ be the number of leaflets hitting a given block:

$P(X=k) = \text{Binomial}(k;\, n=10000,\, p=1/2000) = \binom{n}{k} p^k (1-p)^{n-k}$

The expected number of leaflets that will hit this particular block is $E[X] = np = 10000/2000 = 5$.

Now consider the entire population of blocks: let $X_i$ be the number of leaflets on block $i$, for $i = 1, \dots, 2000$. These counts are constrained by the total number of leaflets, $X_1 + \dots + X_{2000} = n = 10000$, and they have the following joint (multinomial) PMF:

$P(X_1=k_1, X_2=k_2, \dots, X_{2000}=k_{2000}) = \binom{10000}{k_1, k_2, \dots, k_{2000}} \left(\frac{1}{2000}\right)^{k_1}\left(\frac{1}{2000}\right)^{k_2}\cdots\left(\frac{1}{2000}\right)^{k_{2000}} = \binom{10000}{k_1, k_2, \dots, k_{2000}} \left(\frac{1}{2000}\right)^{10000}$

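Summing the above expression over all the other counts gives the marginal distribution of one particular $X_i$, which is exactly the binomial distribution with $n = 10000$ and $p = 1/2000$. So the constraint that every leaflet lands somewhere makes the $X_i$ dependent on one another, but it does not change the distribution of any single block's count. This should help you connect the probability of the leaflet count on one single block to the leaflet counts on all blocks.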
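The "number of blocks with $j$ leaflets" reading from the question can be checked with a small simulation. By linearity of expectation, the expected number of blocks holding exactly $j$ leaflets is $2000\,P(X = j)$ even though the block counts are not independent. The sketch below is only an illustration under stated assumptions: the names `drop_leaflets` and `N_DROPS`, the seed, and the choice of 20 repeated drops are all made up here and do not come from the original answer.

```python
import random
from collections import Counter
from math import comb

N_LEAFLETS = 10_000
N_BLOCKS = 2_000
N_DROPS = 20  # assumed number of repeated experiments, used only to smooth out noise

def binomial_pmf(k, n, p):
    """Exact probability of k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def drop_leaflets(rng):
    """One experiment: map each leaflet count j to the number of blocks with j leaflets."""
    per_block = [0] * N_BLOCKS
    for _ in range(N_LEAFLETS):
        per_block[rng.randrange(N_BLOCKS)] += 1  # each block is equally likely
    return Counter(per_block)

rng = random.Random(0)  # fixed seed so the run is repeatable
totals = Counter()
for _ in range(N_DROPS):
    totals.update(drop_leaflets(rng))

for j in range(11):
    observed = totals[j] / N_DROPS  # average number of blocks with j leaflets per drop
    expected = N_BLOCKS * binomial_pmf(j, N_LEAFLETS, 1 / N_BLOCKS)
    print(f"j={j:2d}  observed blocks: {observed:7.1f}  expected: {expected:7.1f}")
```

Each single drop already fills all 2000 blocks, so one drop gives one noisy sample of the whole table. Averaging over several drops only reduces the noise around $2000\,P(X = j)$.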

I hope I did not make a mistake above. Please let me know if anything does not make sense.