Binomial distribution confusion

What's the difference between pbinom(2074000, 4247000, 0.5) and pbinom(2074, 4247, 0.5)? Why do they give different values when the proportions are the same?

There are 3 best solutions below

0
On

If you're just computing the probability mass function of the binomial distribution directly, the short answer is that the binomial coefficients are different:

$$\binom{4247000}{2074000} \neq \binom{4247}{2074}$$
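To put numbers on this, here is a minimal Python sketch (not R, and not how `pbinom` is implemented) that evaluates the binomial probability mass function for both cases. The helper name `binom_log_pmf` is my own; it works on the log scale via `math.lgamma`, because the probability for the larger case is too small to store as an ordinary floating-point number.

```python
import math

def binom_log_pmf(k, n, p):
    """Natural log of P(X = k) for X ~ Binomial(n, p).

    Hypothetical helper for illustration: the binomial coefficient is
    computed through log-gamma so that huge n does not overflow.
    """
    log_coeff = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return log_coeff + k * math.log(p) + (n - k) * math.log(1 - p)

small = binom_log_pmf(2074, 4247, 0.5)
large = binom_log_pmf(2074000, 4247000, 0.5)

# The small case is an ordinary (if smallish) probability.
print(math.exp(small))

# The large case has a log-probability below -1000; exponentiating it
# underflows to 0.0 in double precision.
print(large, math.exp(large))
```

Note that the two probabilities differ not only through the binomial coefficient but also through the factor $p^k(1-p)^{n-k}$, which is $2^{-4247}$ in one case and $2^{-4247000}$ in the other.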

0
On

It isn't a question of equal proportions; it's a question of the probability of getting an exact proportion out of a given number of trials.

The probability of getting $1$ head out of two coin tosses is $\frac{1}{2}$: that is, $2$ out of the $4$ possible outcomes HH, HT, TH, TT.

That is not the same as the probability of getting $2$ heads out of four coin tosses, which is $\frac{3}{8}$: that is, $6$ out of the $16$ possible outcomes HHHH, HHHT, HHTH, HTHH, THHH, HHTT, HTHT, HTTH, THHT, THTH, TTHH, HTTT, THTT, TTHT, TTTH, TTTT.
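The two counts above can be checked by brute force. The following Python sketch enumerates every sequence of fair coin tosses and counts the ones with the required number of heads; the function name `prob_exact_heads` is just an illustrative choice.

```python
from fractions import Fraction
from itertools import product

def prob_exact_heads(heads, tosses):
    """P(exactly `heads` heads in `tosses` fair tosses), by enumeration."""
    outcomes = list(product("HT", repeat=tosses))   # all 2**tosses sequences
    favourable = [o for o in outcomes if o.count("H") == heads]
    return Fraction(len(favourable), len(outcomes))

print(prob_exact_heads(1, 2))  # 1/2
print(prob_exact_heads(2, 4))  # 3/8
```

Enumeration is only practical for a handful of tosses, but it makes the point: the same proportion of heads has a different probability once the number of tosses changes.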

0
On

In the following, I use the fact that pbinom is defined as a cumulative distribution function rather than a probability mass function. If it were a probability mass function, the effect would be much more dramatic.


When you multiply the number of trials by $1000,$ the range of possible outputs of the binomial distribution also grows by a factor of $1000,$ and so does the variance of the distribution. But the standard deviation is only the square root of the variance, so it grows by a factor of only $\sqrt{1000} \approx 31.6.$

In short, the shape of the binomial distribution for $4247000$ trials is not just like the distribution for $4247$ trials stretched laterally by a factor of $1000.$ The two extremes of the distribution are $1000$ times as far from the mean, but the probable outcomes are not that much further from the mean; you'll find a much higher probability of being within $\pm1\%$ of the mean value. To put it another way, the outcome $2074000$ is many more standard deviations below the mean for $4247000$ trials than the outcome $2074$ is below the mean for $4247$ trials.
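As a rough check of the standard-deviation argument, the Python sketch below computes how many standard deviations each outcome lies below its mean, then uses a plain normal approximation to the cumulative probability. This is an approximation without continuity correction, not what R's `pbinom` computes internally, and the helper names `z_score` and `normal_cdf` are my own.

```python
import math

def z_score(k, n, p=0.5):
    """Number of standard deviations by which k differs from the mean n*p."""
    mean = n * p
    sd = math.sqrt(n * p * (1 - p))
    return (k - mean) / sd

def normal_cdf(z):
    """Standard normal CDF, written with erfc so the lower tail stays accurate."""
    return 0.5 * math.erfc(-z / math.sqrt(2))

z_small = z_score(2074, 4247)        # roughly -1.5
z_large = z_score(2074000, 4247000)  # roughly -48

print(z_small, normal_cdf(z_small))
print(z_large, normal_cdf(z_large))

# The z-scores differ by exactly the factor sqrt(1000) discussed above.
print(z_large / z_small, math.sqrt(1000))
```

An outcome about $1.5$ standard deviations below the mean has a cumulative probability of a few percent, while one about $48$ standard deviations below the mean has a probability so small that it underflows to zero in double precision.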