Probability for a train journey--Poisson approximation to binomial distribution


Suppose the random variable $T$ represents the time (in minutes) needed for one person to travel from city A to city B. $T$ is normally distributed with mean $60$ minutes and variance $20$ (in minutes squared). Also, suppose $600$ people depart at exactly the same time, with their travel times independent of one another.

Now the question is: what is the probability that fewer than $80$ people will need more than $1$ hour to travel?

I tried to do this by using the binomial distribution to calculate the probability that exactly $i$ of the $600$ people are late, and then summing over $i$ from $0$ to $79$, because these are disjoint events. But first I needed the probability that a random person will be late. This is simply $1/2$, because $T$ is normally distributed with mean $60$ and the normal distribution is symmetric about its mean. So, with $X$ the number of people who are late, we get:

$$P(X < 80) = \sum\limits_{i=0}^{79} \frac{600!}{i!(600-i)!} \left(\frac{1}{2}\right)^i\left(\frac{1}{2}\right)^{600-i} =\sum\limits_{i=0}^{79} \frac{600!}{i!(600-i)!} \left(\frac{1}{2}\right)^{600} \approx 4.4 \times 10^{-81} $$

But this probability is practically $0$, which goes against my intuition (it seems reasonably possible for fewer than $80$ people to be late). So where did I go wrong in my reasoning? Also, why did they give the variance, which I didn't use (this was an exam question, by the way)? Does this maybe have something to do with the CLT (central limit theorem)?


There are 2 best solutions below

On BEST ANSWER

Your reasoning is not wrong at all. It may be surprising, but in general the probability that in $n$ trials the number of "successes" deviates from its mean $\mu$ by even a smallish "constant" relative factor $\delta<1$ (that is, by at least $\delta\mu$ in a given direction) is less than $e^{-\frac{\delta^2}{3}\mu}$ (a loose, but often effective, form of the Chernoff bound).

The key here is that if $\delta$ is not too small, you get a number passably smaller than $1$ (in your case $\delta = 220/300 \approx 0.733$, so $e^{-\delta^2/3} \approx 0.836$) raised to the power $\mu$ (N.B. $\mu$, rather than $n$). With a gargantuan $\mu=600/2=300$, it's only to be expected that the probability of such a deviation is really tiny.
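To make the numbers above concrete, here is a small R sketch that evaluates the bound and compares it with the exact binomial tail. It assumes the one-sided form $e^{-\delta^2\mu/3}$ quoted above and takes $\delta = (300-80)/300$; the variable names are only illustrative.

```r
# Chernoff-type bound from the answer: P(deviation >= delta * mu) < exp(-delta^2 * mu / 3)
n     <- 600
p     <- 1/2
mu    <- n * p                 # expected number of late travellers: 300
delta <- (mu - 80) / mu        # relative deviation needed to drop to 80 (assumed: 220/300)
base  <- exp(-delta^2 / 3)     # the number "passably smaller than 1"
bound <- base^mu               # upper bound on the tail probability
exact <- pbinom(79, n, p)      # exact lower tail P(X <= 79)
print(c(delta = delta, base = base, bound = bound, exact = exact))
```

The bound is very loose compared with the exact value, but it is already tiny, which is the point of the argument.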

As for the variance, my guess is that it was just a "red herring". Or maybe there was a typo in the problem: if the mean had been any number of minutes other than $60$ (say, $40$), then the variance would have been needed to find the probability that a single person is late.
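As an illustration of how the variance would enter, here is a sketch in R for the hypothetical variant with mean $40$ mentioned above. This is not the original exam question; the mean of $40$ is only the "say, $40$" example, and the variable names are made up for this sketch.

```r
# Hypothetical variant (NOT the original question): mean travel time of 40 minutes
mean_T <- 40          # hypothetical mean, from the "say, 40" remark
sd_T   <- sqrt(20)    # standard deviation from the stated variance of 20
# Probability that a single traveller needs more than 60 minutes
p_late <- pnorm(60, mean = mean_T, sd = sd_T, lower.tail = FALSE)
# Probability that fewer than 80 of the 600 travellers are late: P(X <= 79)
prob   <- pbinom(79, size = 600, prob = p_late)
print(c(p_late = p_late, prob = prob))
```

In this variant the single-person probability depends on the standard deviation, so the variance is no longer superfluous.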

On

The number $X$ of people traveling more than an hour has $X \sim \mathsf{Binom}(600, 1/2),$ as you say, and you seek $P(X < 80) = P(X \le 79) \approx 4.376 \times 10^{-81},\,$ as computed in R.

pbinom(79, 600, 1/2)
## 4.37635e-81
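The sum in the question can also be evaluated term by term. The following sketch checks that summing the binomial probabilities for $i = 0, \dots, 79$ agrees with `pbinom`; the names `manual`, `direct` and `rel_err` are just illustrative.

```r
# Sum the binomial probabilities term by term for i = 0, ..., 79
terms   <- dbinom(0:79, size = 600, prob = 1/2)   # choose(600, i) * (1/2)^600
manual  <- sum(terms)
direct  <- pbinom(79, size = 600, prob = 1/2)     # cumulative probability P(X <= 79)
# Compare with an explicit relative error, since both values are far below
# the default tolerance that all.equal() would use
rel_err <- abs(manual - direct) / direct
print(c(manual = manual, direct = direct, rel_err = rel_err))
```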


For another view of this problem, as you also suggest, notice that $X$ is approximately distributed as $\mathsf{Norm}(\mu = 300, \sigma=\sqrt{150} \approx 12.25).$ Then, with a continuity correction, $$P(X < 80) = P(X < 79.5) \approx P\left(Z \le \frac{79.5 - 300}{\sqrt{150}}\right) = P(Z \le -18.004) \approx 0.$$

There is almost no probability under a normal curve more than 18 standard deviations below the mean. [By the 'Empirical Rule', $2P(Z < -3) \approx 0.003$ for standard normal $Z$.]

pnorm(79.5, 300, sqrt(150))
## 9.10325e-73
pnorm((79.5-300)/sqrt(150), 0, 1)
## 9.10325e-73
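As a rough side-by-side check, this sketch recomputes the standardized value and compares the normal approximation with the exact binomial tail. Both are negligible, but this far out in the tail they differ by many orders of magnitude, so the approximation should only be read as "essentially zero".

```r
# Compare the exact binomial tail with its normal approximation
exact  <- pbinom(79, 600, 1/2)            # exact P(X <= 79)
approx <- pnorm(79.5, 300, sqrt(150))     # normal approximation with continuity correction
z      <- (79.5 - 300) / sqrt(150)        # standardized value of 79.5
print(c(z = z, exact = exact, approx = approx, ratio = approx / exact))
```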