Getting a probability $> 1$ in hypothesis test


I have the following hypothesis question.

A soft drinks company claims that of all consumers buying their product, $82 \%$ prefer the light version of the drink. To test their claim, data were collected from $52$ consumers, with $44$ preferring the light version of the product.

$H_0$: $p = p_0 = 0.82$, $H_1$: $p \neq 0.82$.

Under the null hypothesis, $X \sim \mathrm{Bin}(52,0.82)$. The observed test statistic is $x = 44$.

I need the two-sided P-value, which I tried to compute as $2P(X \leq 44)$, but I get $1.48$?

Following is my code in R:

  > 2*pbinom(44,52,0.82)
  [1] 1.483675

If I run the following, using the right tail instead, why is the result less than 1?

  > 2*pbinom(44,52,0.82, lower.tail = FALSE)
  [1] 0.5163253

And why does taking the left tail work in this example, giving a P-value less than 1?

  > 2*pbinom(311,500,0.65)
  [1] 0.2065312

There is 1 best solution below


Let's begin by temporarily putting the formulas aside and trying to take an intuitive view of the test of the null hypothesis $H_0: p = p_0 = 0.82$ against the two-sided alternative $H_a: p \ne 0.82,$ based on $n = 52$ observations with $x = 44$ Successes (people who prefer the light version).

Exact binomial test: The test begins by assuming that the observed number of successes is $X \sim \mathsf{Binom}(n = 52,\, p = 0.82).$ The figure below shows the PDF of this distribution.

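[Figure: PDF of $\mathsf{Binom}(52,\, 0.82)$; the bar at $x = 44$ is shown in red and a vertical dotted blue line marks the mean 42.64.]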

If $H_0$ is true, we expect on average $np_0 = 52(.82) = 42.64$ successes (vertical dotted blue line). We observed $x = 44$ Successes, slightly more than expected. The corresponding bar in the plot is shown in red. The question is whether the observed value $x = 44$ is different enough from the expected value (42 or 43) to cast doubt on the truth of the null hypothesis.

Now we need to do some computations: The P-value of a right-sided test (alternative $p > 0.82)$ is the sum of the heights of the bars at values 44 through 52. That is $P(X \ge 44) = 0.3920;$ computation in R below. This is the probability of an event as extreme or more extreme than what we observed, in an upward direction.

sum(dbinom(44:52, 52, .82))
## 0.3919817
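
The same upper-tail probability can be obtained from `pbinom`, with one caveat that matters for the code in the question: with `lower.tail = FALSE`, `pbinom(q, ...)` returns $P(X > q),$ not $P(X \ge q),$ so the cutoff has to be 43 rather than 44. A minimal sketch (the variable names are mine, chosen for illustration):

```r
n <- 52; p0 <- 0.82; x <- 44

# P(X >= 44) as a sum of PDF heights, as above
upper_sum <- sum(dbinom(x:n, n, p0))

# Same probability from the CDF: P(X >= 44) = P(X > 43)
upper_cdf <- pbinom(x - 1, n, p0, lower.tail = FALSE)

# Using 44 as the cutoff drops the bar at 44 and gives P(X >= 45) instead
upper_off_by_one <- pbinom(x, n, p0, lower.tail = FALSE)

c(upper_sum, upper_cdf, upper_off_by_one)
```

The first two values agree (0.3919817). The third is the smaller probability $P(X \ge 45),$ about 0.258, which is half of the 0.5163253 shown in the question.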

For the P-value of a 2-sided test, we also need the probability of a result as or more extreme in a downward direction. In a symmetrical situation, we would just choose the probabilities of the bars as far below the dotted blue line as $x = 44$ is above. (But in this problem it's not exactly clear whether to use the combined heights of bars at or below 42 or to use the combined heights at or below 41. The two probabilities would be about 0.4644 and 0.3289, respectively.)

pbinom(42, 52, .82)
## 0.4644076
pbinom(41, 52, .82)
## 0.328853           # See Note (b) at end

In such a case, some statisticians double the one-sided P-value 0.3920 to get the two-sided P-value 0.7840. Consequently, testing at the 5% level of significance, we do not have evidence to reject $H_0$ against the two-sided alternative because the P-value $0.7840 > 0.05.$

2*sum(dbinom(44:52, 52, .82))
## 0.7839634
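
Doubling only makes sense for the smaller of the two tails. Here $x = 44$ lies above the mean 42.64, so the lower tail $P(X \le 44)$ already exceeds 1/2, and doubling it gives the 1.48 in the question. One common convention, sketched below as an assumption rather than a universal rule, doubles the smaller tail and caps the result at 1. The function name `double_tail_p` is hypothetical.

```r
# Two-sided P-value by doubling the smaller tail (hypothetical helper)
double_tail_p <- function(x, n, p0) {
  lower <- pbinom(x, n, p0)                          # P(X <= x)
  upper <- pbinom(x - 1, n, p0, lower.tail = FALSE)  # P(X >= x)
  min(1, 2 * min(lower, upper))                      # never report more than 1
}

pbinom(44, 52, 0.82)           # lower tail exceeds 1/2, so doubling it exceeds 1
double_tail_p(44, 52, 0.82)    # doubles the upper tail instead
double_tail_p(311, 500, 0.65)  # 311 is below the mean 325, so the lower tail is doubled
```

This also accounts for the second example in the question: 311 is below the expected 325, so there the lower tail is the smaller one and doubling it stays below 1.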

Normal approximation with continuity correction: An alternative method is to use the normal approximation to the binomial distribution. Let $n = 52,\, x = 44,\, p_0 = 0.82,\, \mu_0 = np_0 = 42.64,$ and $\sigma_0 = \sqrt{np_0(1-p_0)}.$ Then the test statistic is $Z_0 = (43.5 - \mu_0)/\sigma_0,$ where the use of 43.5 instead of 44 is called the 'continuity correction'. [Under the approximating normal curve, the probability associated with $x = 44$ lies above the interval $(43.5, 44.5).$]

Then under $H_0,$ the test statistic is approximately standard normal. The P-value is $P(|Z| \ge Z_0) = 0.76.$ Computations in R are shown below. Even though we approximated the two-sided P-value (0.784) of the 'exact' binomial test by doubling the one-sided P-value, that result is generally regarded as more accurate than the P-value (0.76) from the normal approximation. (Even with a continuity correction one does not expect more than two-place accuracy from a normal approximation when $n$ is below about 100.)

n = 52;  x = 44;  p.0 = .82; mu.0 = n*p.0;  sg.0 = sqrt(n*p.0*(1-p.0))
z = (43.5 - mu.0)/sg.0; z
## 0.3104228
p.val = 2*pnorm(-z);  p.val
## 0.7562395

Notes: (a) In R, dbinom denotes a binomial PDF and pbinom denotes a binomial CDF. Also, pnorm denotes a normal CDF. The notation 44:52 denotes a vector of integers from 44 through 52.

(b) Even though some statisticians double the (observed) one-sided P-value to get the P-value for the two sided test, there is not universal agreement on this. I showed the 'doubling method' because you mentioned it in your question.

Here is output from binom.test in R:

binom.test(44, 52, .82, alt="two")

        Exact binomial test

data:  44 and 52
number of successes = 44, number of trials = 52, p-value = 0.7208
alternative hypothesis: true probability of success is not equal to 0.82
95 percent confidence interval:
 0.7191889 0.9311608
sample estimates:
probability of success 
             0.8461538 

The P-value shown here is $P(X \ge 44 | p=.82) + P(X \le 41 | p=.82) = 0.7208$ (this uses the second of the two lower-tail choices mentioned above, which the doubling method did not use). Of course, the conclusion is the same: there is no evidence to reject $H_0.$

x = c(0:41, 44:52);  sum(dbinom(x, 52, .82))
## 0.7208348
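
As far as I understand R's documentation, `binom.test` gets its two-sided P-value by summing the probabilities of all outcomes that are no more likely than the observed one. That would explain why the lower tail stops at 41: the bar at 42 is slightly taller than the bar at 44. A sketch of that rule (the small tolerance factor is an assumption, meant to guard against rounding ties):

```r
n <- 52; p0 <- 0.82; x <- 44
probs <- dbinom(0:n, n, p0)   # PDF heights for 0, 1, ..., 52
d_obs <- dbinom(x, n, p0)     # height of the observed bar at 44

# Outcomes at most as likely as x = 44 (tolerance is an assumed safeguard)
as_extreme <- probs <= d_obs * (1 + 1e-7)

(0:n)[as_extreme]             # which outcomes count as 'as or more extreme'
p_two_sided <- sum(probs[as_extreme])
p_two_sided
```

Under this rule the selected outcomes are 0 through 41 and 44 through 52, matching the sum computed above.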

(c) Not all statistical software uses the continuity correction. For example, Minitab 17, under the normal approximation option, omits the continuity correction to get P-value 0.623.
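
To see the effect of the continuity correction, the normal-approximation P-value can be computed both ways. This is a sketch in R, not Minitab's actual procedure; the cross-check with `prop.test` assumes that its `correct` argument applies the same 0.5 adjustment in this one-sample case.

```r
n <- 52; x <- 44; p0 <- 0.82
mu0 <- n * p0
sg0 <- sqrt(n * p0 * (1 - p0))

# Two-sided normal-approximation P-values.
# Subtracting 0.5 is the right direction here because x is above mu0.
p_corrected   <- 2 * pnorm(-abs((x - 0.5 - mu0) / sg0))
p_uncorrected <- 2 * pnorm(-abs((x - mu0) / sg0))
c(p_corrected, p_uncorrected)

# Cross-check with the chi-squared form of the same test
prop.test(x, n, p = p0, correct = TRUE)$p.value
prop.test(x, n, p = p0, correct = FALSE)$p.value
```

The corrected value reproduces the 0.7562395 computed earlier, and the uncorrected one should land close to the 0.623 reported for Minitab.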