I have been reading the following motivation behind the Poisson distribution, and I am confused about why we assume that every disjoint interval has the same probability of success based on the information we are given. Here is the passage:
To see a concrete example of how Poisson distribution arises, suppose we want to model the number of cell phone users initiating calls in a network during a time period, of duration (say) 1 minute. There are many customers in the network, and all of them can potentially make a call during this time period. However, only a very small fraction of them actually will. Under this scenario, it seems reasonable to make two assumptions:
- The probability of having more than 1 customer initiating a call in any small time interval is negligible.
- The initiations of calls in disjoint time intervals are independent events.
Then if we divide the one-minute time period into $n$ disjoint intervals, the number of calls $X$ in that time period can be modeled as a binomial random variable with parameters $n$ and $p$, where $p$ is the probability of having a call initiated in a time interval of length $\frac{1}{n}$.
I have 3 questions here:
- Why would we expect that each disjoint interval has the same probability $p$ of having a customer call in that interval?
- Is the reasoning behind expecting either a call or no call in each time interval that $n$ is very large, so that we can apply our 2nd assumption above?
- What would the sample points in this scenario look like? Would they simply be collections of cell phone users in this network, with the times at which they call being properties of the people we sample, which depend on how we sampled them? And do we sample the people from the general population of all people? I am having trouble pinning down exactly what is sampled here, so I would appreciate some help with this.
For a fairly circular reason why we make that assumption: it's so that we wind up with a Poisson distribution rather than something else.
As for a practical argument, the assumption is that once we get to a small scale there isn't much real distinction between any of the individual intervals. For example, if our period of observation is between 12:00 and 12:01, and we take 600 intervals, then each interval is a 100 ms slice of that minute. Is there any reason why we would expect that a phone call is more likely to happen between 12:00:00.300 and 12:00:00.400 than between 12:00:00.700 and 12:00:00.800? Outside of some very precise automated dialing system, it seems like a fair approximation to say that the two intervals are equally likely to have a call happen in them.
Of course, if the scale is bigger then that idea falls apart - if we look at the period between 12:00 and 18:00, then the probability that a phone call happens close to 12:00 is likely to be quite different from the probability of one happening closer to 18:00.
As for what we're sampling in this model - we are sampling the time intervals, and our variable of interest is $X_i = 1$ if a phone call happened in interval $i$, and $0$ if it didn't.
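To see the binomial-to-Poisson passage numerically, here is a small sketch (the rate $\lambda = 3$ calls per minute is an assumed value purely for illustration). With $n$ equal sub-intervals, each has success probability $p = \lambda/n$, and as $n$ grows the Binomial$(n, \lambda/n)$ pmf gets close to the Poisson$(\lambda)$ pmf:

```python
import math

def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam)."""
    return math.exp(-lam) * lam**k / math.factorial(k)

lam = 3.0  # assumed rate: 3 calls expected in the one-minute period

# As n grows, the per-interval success probability p = lam/n shrinks,
# and the binomial pmf converges pointwise to the Poisson pmf.
for n in (10, 100, 10000):
    p = lam / n
    max_gap = max(abs(binom_pmf(k, n, p) - poisson_pmf(k, lam))
                  for k in range(20))
    print(f"n={n:>6}: max |Binomial - Poisson| over k<20 = {max_gap:.6f}")
```

The printed gap shrinks as $n$ increases, which is exactly the limit $\binom{n}{k}\left(\frac{\lambda}{n}\right)^k\left(1-\frac{\lambda}{n}\right)^{n-k} \to e^{-\lambda}\frac{\lambda^k}{k!}$ behind the passage you quoted.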