What's $r$ going to be when you take the sum of $36$ geometric $X_i$'s?


Let $X_1,X_2,\ldots,X_{36}$ be a random sample of size $n=36$ from the geometric distribution with the p.d.f:

$$f(x) = \left(\frac{1}{4}\right)^{x-1} \left(\frac{3}{4}\right), x = 0,1,2,\ldots$$

Now I'm trying to use normal approximation for $P\left(46 \leq \sum_{i=1}^{36}X_i \leq 49\right)$

Let's say:

$$W=\sum_{i=1}^{36}X_i$$

$$\therefore W \sim \text{Negative Binomial}$$

Does $p$ stay the same, at $\frac{3}{4}$?

What's $r$?

I'm not too sure. I'm trying to calculate $E[W]$ and $Var(W)$


There are 2 best solutions below

0
On BEST ANSWER

You have a problem with your geometric PMF: as written, the sum from $x = 0$ to $\infty$ is $4$, not $1$. As such, you must write either

$$\Pr[X = x] = (1/4)^x (3/4), \quad x = 0, 1, 2, \ldots,$$ or $$\Pr[X = x] = (1/4)^{x-1} (3/4), \quad x = 1, 2, 3, \ldots.$$ Which one you mean, I cannot tell, and because the supports are different, the resulting probability will be very different. I will assume the latter.
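The normalization issue can be checked numerically. The following is a minimal sketch, not part of the original answer; the helper name `partial_sum` and the truncation at 200 terms are my own assumptions, chosen so that the neglected tail is negligible.

```python
from fractions import Fraction

def partial_sum(exponent_shift, start, terms=200):
    """Sum (1/4)**(x - exponent_shift) * (3/4) for x = start, ..., start + terms - 1.

    Hypothetical helper: a truncated version of the infinite series.
    """
    q, p = Fraction(1, 4), Fraction(3, 4)
    return sum(q ** (x - exponent_shift) * p for x in range(start, start + terms))

as_written = float(partial_sum(1, 0))  # exponent x-1, support starting at 0 (the question's form)
fix_a = float(partial_sum(0, 0))       # exponent x,   support starting at 0
fix_b = float(partial_sum(1, 1))       # exponent x-1, support starting at 1

print(as_written, fix_a, fix_b)
```

Only the two corrected forms sum to $1$; the form in the question picks up an extra term $(1/4)^{-1}(3/4) = 3$ at $x = 0$.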

That said....


When you can get the exact distribution of the sum, why use an approximation to get the probability, especially when the number of terms is tractable?

A geometric distribution counts the random number of trials in a series of independent Bernoulli trials with probability of "success" $p$ until the first success is observed. In your case, the probability of success is $p = 3/4$ and we are counting the total number of trials observed, including the success. The PMF is $$\Pr[X = x] = (1-p)^{x-1} p, \quad x = 1, 2, 3, \ldots.$$

The negative binomial distribution counts the random number of trials in a series of independent Bernoulli trials with probability of success $p$ until the $r^{\rm th}$ success is observed, for $r \ge 1$. When $r = 1$, we get a geometric distribution. Under this definition, we see that the sum of $r$ IID geometric variables $$S_r = X_1 + X_2 + \cdots + X_r$$ is negative binomial with parameters $p$ and $r$, where $p$ is inherited from the underlying geometric distribution for the individual $X_i$s.

It is not difficult to reason that $$\Pr[S_r = x] = \binom{x-1}{r-1} (1-p)^{x-r} p^r, \quad x = r, r+1, r+2, \ldots.$$ This is because in any sequence of $x$ trials such that the $r^{\rm th}$ success is observed on the final trial, there are $\binom{x-1}{r-1}$ ways to choose which of the first $x-1$ trials are the $r-1$ earlier successes; each such sequence contains $r$ successes and $x-r$ failures, so it has probability $(1-p)^{x-r} p^r$.

It follows that in your case, $r = 36$, and $$\Pr[46 \le S_{36} \le 49] = \sum_{x=46}^{49} \binom{x-1}{35} (1/4)^{x-36} (3/4)^{36},$$ a sum requiring only four terms.
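The four-term sum is easy to evaluate exactly. Here is a minimal sketch, assuming the support $x = 1, 2, 3, \ldots$ for each $X_i$ as above; the function name `neg_binom_pmf` is my own and not from any library.

```python
from fractions import Fraction
from math import comb

def neg_binom_pmf(x, r, p):
    """P[S_r = x]: probability that the r-th success occurs on trial x."""
    if x < r:
        return Fraction(0)
    return comb(x - 1, r - 1) * (1 - p) ** (x - r) * p ** r

p = Fraction(3, 4)  # success probability inherited from each geometric X_i
r = 36              # number of geometric variables being summed

# Exact rational value of Pr[46 <= S_36 <= 49], then a float for display.
exact = sum(neg_binom_pmf(x, r, p) for x in range(46, 50))
print(float(exact))
```

Using `Fraction` keeps the arithmetic exact until the final conversion, so no rounding enters the binomial coefficients or the powers.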


By comparison, using a normal approximation with continuity correction, the mean is $\mu = r/p = 48$, and standard deviation is $\sigma = \sqrt{r(1-p)/p^2} = 4$; we find $$\Pr[46 \le S_{36} \le 49] \approx \Pr\left[\frac{46 - 48 - 0.5}{4} \le \frac{S_{36} - \mu}{\sigma} \le \frac{49 - 48 + 0.5}{4}\right] \approx \Pr[-0.625 \le Z \le 0.375] \approx 0.380184.$$ This deviates from the precise probability above by about $0.008$.
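The approximation can be reproduced with the Python standard library. This is a sketch under the same assumption (support starting at $1$); the variable names are mine.

```python
from statistics import NormalDist

r, p = 36, 0.75
mu = r / p                        # mean of the sum
sigma = (r * (1 - p)) ** 0.5 / p  # standard deviation of the sum

# Continuity correction: widen the integer interval [46, 49] by 0.5 on each side.
lower, upper = 46 - 0.5, 49 + 0.5

normal = NormalDist(mu, sigma)
approx = normal.cdf(upper) - normal.cdf(lower)
print(mu, sigma, approx)
```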

0
On

The parameter $p$, if it is construed in the usual way, would remain the same, and $r$ would be $36$.

But that's not the best way to proceed. Find the expected value and variance of your geometric distribution. Multiply each of them by $36$. Because the $X_i$ are independent, those are the expected value and variance of the sum. Use the normal distribution with that same expected value and that same variance.
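As a sketch of that recipe, assuming each $X_i$ counts trials up to and including the first success (support $1, 2, 3, \ldots$), so that $E[X_i] = 1/p$ and $\operatorname{Var}(X_i) = (1-p)/p^2$:

```python
from fractions import Fraction

p = Fraction(3, 4)
n = 36

# Single geometric variable on the support 1, 2, 3, ... (an assumption).
mean_x = 1 / p            # 4/3
var_x = (1 - p) / p ** 2  # 4/9

# Sum of n independent copies: means add, and so do variances.
mean_w = n * mean_x
var_w = n * var_x
print(mean_w, var_w)
```

If the support instead starts at $0$, each mean drops by $1$, so $E[W]$ drops by $36$, while the variance is unchanged.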