I am looking for a probability distribution that calculates the probability of $x$ total Bernoulli trials given fixed $k$ successes.
I have looked into negative binomial distribution: $$ P(X=x) = \binom{x-1}{k-1} p^k (1-p)^{x-k} $$ where:
- $x$ is the number of Bernoulli trials (not fixed)
- $k$ is the number of successes (fixed)
- $p$ is the probability of success (fixed)
But I've read that this particular negative binomial distribution gives the probability that the $k$th success occurs on trial $x$, i.e., the number of trials UNTIL the $k$th success. I might be wrong, but this is not the same as what I am looking for, which is the probability of $x$ total trials, given $k$ successes.
If they are the same, can someone explain how they are equivalent? The Wikipedia page is confusing to me.
Your question is related to the concept of conditional probability. Recall Bayes' theorem: $$P(X=x|k) = \frac{P(X=x)P(k|X=x)}{P(k)}=\frac{P(X=x)P(k|X=x)}{\sum_i P(X=i)P(k|X=i)}$$
We can back out the desired distribution (the LHS) from the RHS. In particular, we make use of $P(k|X=x)$, which is given by the binomial distribution.
Without conditioning on $k$ and without a given prior distribution of $X$, the number of trials $X$ can be assumed to follow a uniform distribution (an improper prior). Thus $P(X=i)$ is a constant that cancels, and the equation reduces to $$P(X=x|k) = \frac{P(k|X=x)}{\sum_i P(k|X=i)}$$
First of all, it is obvious that $X\geq k$.
For every $x\geq k$, $P(k|X=x) = C^x_kp^k(1-p)^{x-k}$.
Therefore,
$$P(X=x|k) = \frac{C^x_kp^k(1-p)^{x-k}}{\sum_{i=k}^\infty C^i_kp^k(1-p)^{i-k}}$$ $$=\frac{C^x_k(1-p)^{x-k}}{\sum_{i=k}^\infty C^i_k(1-p)^{i-k}}$$ $$=\underline{\underline{C^x_k(1-p)^{x-k}p^{k+1}}}$$ where the last step uses the negative binomial series $\sum_{i=k}^\infty C^i_k q^{i-k} = (1-q)^{-(k+1)}$ with $q=1-p$, so the denominator equals $p^{-(k+1)}$.
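As a quick numerical sanity check (not part of the derivation itself), one can verify that the closed form $C^x_k(1-p)^{x-k}p^{k+1}$ sums to $1$ over $x = k, k+1, \dots$, using arbitrary values of $p$ and $k$:

```python
from math import comb

# Check that C(x, k) * (1-p)^(x-k) * p^(k+1) is a valid pmf over x >= k.
# p and k below are arbitrary test values, not anything from the derivation.
p, k = 0.3, 4
total = sum(comb(x, k) * (1 - p) ** (x - k) * p ** (k + 1)
            for x in range(k, 2000))  # truncate the infinite sum; the tail is negligible
print(total)  # ≈ 1.0
```

The geometric decay of $(1-p)^{x-k}$ makes the truncation error vanishingly small, so the printed total confirms the normalizing constant $p^{k+1}$.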
Now, compare this result with the negative binomial distribution $C^{x-1}_{k-1}p^k(1-p)^{x-k}$. They closely resemble each other; the differences lie in the binomial coefficient and the exponent of $p$:
$$C^x_k \text{ vs } C^{x-1}_{k-1}$$ $$p^{k+1} \text{ vs } p^k$$
The intuitive explanation is as follows.
The negative binomial is the distribution of the number of trials until the $k$th success. Therefore, the last trial must be a success, which means the binomial coefficient counts arrangements of the remaining $(k-1)$ successes among the first $(x-1)$ trials only.
On the other hand, the desired distribution does not fix the outcome of the last trial, so the coefficient counts arrangements of all $k$ successes among all $x$ trials.
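To make the "last trial must be a success" interpretation concrete, here is a small Monte Carlo sketch (with arbitrary, assumed values of $p$ and $k$): draw Bernoulli trials until the $k$th success, record the total number of trials, and compare the empirical frequencies with the negative binomial pmf $C^{x-1}_{k-1}p^k(1-p)^{x-k}$.

```python
import random
from math import comb

random.seed(0)
p, k, n_sims = 0.4, 3, 200_000  # assumed test parameters

# Simulate: run Bernoulli(p) trials until the k-th success,
# recording how many trials that took.
counts = {}
for _ in range(n_sims):
    trials = successes = 0
    while successes < k:
        trials += 1
        if random.random() < p:
            successes += 1
    counts[trials] = counts.get(trials, 0) + 1

# Compare empirical frequencies with the negative binomial pmf.
for x in range(k, k + 5):
    empirical = counts.get(x, 0) / n_sims
    exact = comb(x - 1, k - 1) * p ** k * (1 - p) ** (x - k)
    print(x, round(empirical, 4), round(exact, 4))
```

The empirical and exact columns should agree to within sampling noise, confirming that the negative binomial describes the trial count at which the $k$th success occurs.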