The exact binomial test formula

3.3k Views Asked by At

Could you help me find the one-sided exact binomial test formula?

I use this statistical test in R-language, but I can't find the formula for it.

Eng Wikipedia (https://en.wikipedia.org/wiki/Binomial_test) and RLang help gives me only examples without needed math description.

Thank you!

2

There are 2 best solutions below

1
On BEST ANSWER

You did not state or show a particular test of interest to you, or say what you have tried. I will try to clarify the specific example in Wikipedia, which you have tried to understand.

A die is rolled 235 times and shows face six 51 times. If the die is fair, then then number $X$ of sixes seen is $X \sim Binom(235, 1/6),$ so that $E(X) = np = 235(1/6) = 34.17.$ Our observed number 51 of sixes seems a lot larger than 34, so we wonder if the die is unfair, specifically showing sixes more often than it 'should'.

Accordingly, we wish to test $H_0; p = 1/6$ against the right-sided alternative $H_a: p > 1/6$. The P-value of the test is $P(X > 51) = 1 - P(X \le 50).$ Wikipedia says this is .027. In R, this probability can be computed in either of two ways, using the binomial CDF pbinom or PDF dbinom, respectively:

  > 1 - pbinom(50, 235, 1/6)
  [1] 0.02654425
  > sum(dbinom(51:235, 235, 1/6))
  [1] 0.02654425

Either way, this amounts to $\sum_{i=51}^{235} {235 \choose i} p^i (1-p)^{235-i}.$ This is also the 'formula' used in binom.test to obtain the P-value:

 > binom.test(51, 235, 1/6, alte="g")

         Exact binomial test

 data:  51 and 235 
 number of successes = 51, number of trials = 235, p-value = 0.02654
 alternative hypothesis: true probability of success is greater than 0.1666667 
 95 percent confidence interval:
   0.1735253 1.0000000 
 sample estimates:
 probability of success 
              0.2170213 

The difference between just using the CDF or PDF and using binom.test is that the latter prints a lot of additional information about the test and an associated 95% confidence interval. The help screen at ? binom.test gives some explanation and a reference for how the CI is found. This style of CI is notoriously conservative (possibly giving a smaller lower bound than necessary) but it guarantees 95% coverage probability.

The one-sided version of the 'Agresti-Coull' (sometimes called 'plus-4') style of CI, based on the normal approximation to the binomial (applicable here), also provides a lower bound. It can be found as:

 > n = 236;  x = 51
 > n.adj = n+4;  p.hat = (x+2)/n.adj
 > z.star = qnorm(.95)
 > p.hat - z.star*sqrt(p.hat*(1-p.hat)/n.adj)
 [1] 0.1767911
0
On

Does this help demonstrate the $p$-value calaculations?

> binom.test(51,235,(1/6),alternative="less")
        Exact binomial test
data:  51 and 235
number of successes = 51, number of trials = 235, p-value = 0.982
> pbinom(q=51, size=235, prob=1/6)
[1] 0.9820227
> 
> binom.test(51,235,(1/6),alternative="greater")
        Exact binomial test
data:  51 and 235
number of successes = 51, number of trials = 235, p-value = 0.02654
> 1 - pbinom(q=51 - 1, size=235, prob=1/6)
[1] 0.02654425
> 
> binom.test(51,235,(1/6),alternative="two.sided")
        Exact binomial test
data:  51 and 235
number of successes = 51, number of trials = 235, p-value = 0.04375
> pbinom(q = 2*235*1/6 - 51, size=235, prob=1/6) + 
+    1 - pbinom(q = 51 - 1, size=235, prob=1/6)
[1] 0.04374797
> 

The two-sided formula would change slightly if $\dfrac{\text{successes}}{\text{trials}} \lt \dfrac16$