The following was the problem that I was working on.
As a part of the underwriting process for insurance, each prospective policyholder is tested for high blood pressure. Let X represent the number of tests completed when the first person with high blood pressure is found. The expected value of X is 12.5. Calculate the probability that the sixth person tested is the first one with high blood pressure.
This was my approach.
I noticed that they did not tell me what kind of distribution X would be, so I would have to come up with a model of my own that fits this problem.
When I was in college I was not too good at the upper-division statistics course, but I vaguely remembered that waiting times and the Poisson distribution were closely related.
So I first assumed that X was Poisson with mean 12.5, and the probability that I got was approximately 0.02.
Since this was not among the answer choices, I assumed that it was binomial, but I soon noticed that there was no "number of trials," and I came to the conclusion that the distribution was geometric.
Although I was able to reach the answer eventually, was there any reason that the geometric distribution was used for this problem?
It would be great if someone could give me a good example of when to use the Poisson distribution and when to use the geometric distribution.
When we have a random variable $X$, the first thing we should ask is what values it can take on.
In your case, $X$ is a discrete random variable that can only take on values from $\{1, 2, 3, \ldots\}$, the positive integers. It is not limited to some upper bound, so we know right away that $X$ is not binomial. It is also not Poisson, because a Poisson variable has $\Pr[X = 0] > 0$, whereas here, we cannot observe $X = 0$ because at least one observation must be made in order for us to observe someone with high blood pressure. More important is the distinction that, although a Poisson variable represents the count of something, it counts an event that does not always occur, rather than counting the number of trials in order to see some event of interest. In other words, if we observe $N$ occurrences of some event over some time period $T$, then $N$ might be modeled as a Poisson variable; but if we count the number of trials $X$ needed to observe $r$ events of interest, then this is not Poisson, because this situation has nothing to do with time periods (the trials need not occur over equal intervals, for instance).
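The distinction between counting events and counting trials can be made concrete with a small simulation. This is just an illustrative sketch: the value $p = 0.08$ is assumed here purely for the example, and the variable names are our own.

```python
import random

random.seed(1)  # for reproducibility
p = 0.08  # assumed chance that any one person has high blood pressure (illustrative)

# Count-of-events view: among a fixed batch of 100 screenings, how many people
# have high blood pressure?  This count CAN be zero; for a large batch with
# small p it is approximately Poisson with mean 100 * p.
n_with_high_bp = sum(random.random() < p for _ in range(100))

# Waiting-time view: how many screenings until the FIRST person with high
# blood pressure?  This count is at least 1 by construction; it is a count of
# trials, not a count of events, so it is not Poisson.
trials = 1
while random.random() >= p:
    trials += 1

print(n_with_high_bp, trials)
```

The first quantity can legitimately be $0$; the second never can, which mirrors the argument above.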
So suppose we can assume that the outcome of each trial is dichotomous: i.e., either we observe the event of interest ("success"), or we do not observe it ("failure"). In your case, this assumption clearly holds; either a randomly selected policyholder has high blood pressure (success), or they do not (failure). Furthermore, suppose we can assume that these trials are independent and identically distributed: the status of one policyholder does not influence the status of any other, and that throughout the process of taking a random sample, the probability that a policyholder has high blood pressure is constant and does not change from trial to trial. This too is also a reasonable assumption (although it can be violated under certain sampling methods).
With this in mind, the distribution of the number of trials $X$ needed to observe the $r^{\rm th}$ success has a negative binomial distribution with parameters $r$ and $p$, and probability mass function $$\Pr[X = k] = \binom{k-1}{r-1} p^r (1-p)^{k-r}, \quad k = r, r+1, r+2, \ldots.$$ This makes sense because if $k$ trials are needed to see $r$ successes, that means the $k^{\rm th}$ trial must be successful. Hence there are $\binom{k-1}{r-1}$ ways to have observed the remaining $r-1$ successes among the first $k-1$ trials, and we multiply this by the probability $p^r$ of observing $r$ successes and $(1-p)^{k-r}$ of observing $k-r$ failures.
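This PMF translates directly into code. The following is a sketch (the helper name `neg_binom_pmf` is ours), with a sanity check that the probabilities sum to essentially $1$:

```python
from math import comb

def neg_binom_pmf(k, r, p):
    """P(X = k): the r-th success occurs on the k-th trial (k = r, r+1, ...)."""
    return comb(k - 1, r - 1) * p**r * (1 - p)**(k - r)

r, p = 3, 0.2
# Sanity check: summing over a long range of k should give (essentially) 1.
total = sum(neg_binom_pmf(k, r, p) for k in range(r, 1000))
print(abs(total - 1.0) < 1e-9)
```

One caution if you check this against a library: SciPy's `scipy.stats.nbinom`, for example, parameterizes the distribution by the number of *failures* rather than the number of trials, so its PMF is evaluated at $k - r$ rather than at $k$.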
A special case of the negative binomial distribution is when $r = 1$: that is, we are interested in the number of trials needed to observe the first success. Then $X$ has a geometric distribution with parameter $p$, and $$\Pr[X = k] = p(1-p)^{k-1}, \quad k = 1, 2, 3, \ldots.$$ This is simply the result of substituting $r = 1$ into the negative binomial distribution above.
It is not hard to show from the PMF that $${\rm E}[X] = \frac{1}{p}.$$ This simply means the expected value of the number of trials needed to observe the first success is the reciprocal of the probability of success for a single trial, which makes intuitive sense: if the chance that any given trial will be successful is $p$, then you should expect to observe, on average, $1/p$ trials to see your first success (though this is not a rigorous argument).
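Applying this to the original problem: ${\rm E}[X] = 12.5$ together with ${\rm E}[X] = 1/p$ gives $p = 1/12.5 = 0.08$, and the geometric PMF then gives the answer directly. A minimal check:

```python
# E[X] = 12.5 and E[X] = 1/p together give p = 1/12.5 = 0.08.
p = 1 / 12.5
# Probability that the sixth person tested is the first with high blood pressure:
k = 6
prob = p * (1 - p) ** (k - 1)
print(round(prob, 4))  # 0.0527
```

That is, $\Pr[X = 6] = 0.08 \cdot 0.92^5 \approx 0.0527$.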
So, here are some example scenarios; for each one, you should decide which parametric distribution is an appropriate model: